GENETIC CONTROL OF FLOWERING

This invention relates to the genetic control of 
flowering in plants and the cloning and expression of 
genes involved therein. More particularly, the invention 
relates to the cloning and expression of the FCA gene of
Arabidopsis thaliana, and homologues from other species,
and manipulation and use of these genes in plants.

Efficient flowering in plants is important, 
particularly when the intended product is the flower or 
the seed produced therefrom. One aspect of this is the 
timing of flowering: advancing or retarding the onset of 
flowering can be useful to farmers and seed producers. An 
understanding of the genetic mechanisms which influence 
flowering provides a means for altering the flowering 
characteristics of the target plant. Species for which 
flowering is important to crop production are numerous,
essentially all crops which are grown from seed, with 
important examples being the cereals, rice and maize, 
probably the most agronomically important in warmer 
climatic zones, and wheat, barley, oats and rye in more 
temperate climates. Important seed products are oil seed 
rape, sugar beet, maize, sunflower, soybean and sorghum. 
Many crops which are harvested for their roots are, of 
course, grown annually from seed and the production of 
seed of any kind is very dependent upon the ability of 
the plant to flower, to be pollinated and to set seed. In 
horticulture, control of the timing of flowering is 
important. Horticultural plants whose flowering may be
controlled include lettuce, endive and vegetable
brassicas including cabbage, broccoli and cauliflower, 
and carnations and geraniums. 

Arabidopsis thaliana is a facultative long-day
plant, flowering early under long days and late under
short days. Because it has a small, well-characterized
genome, is relatively easily transformed and regenerated 
and has a rapid growing cycle, Arabidopsis is an ideal 
model plant in which to study flowering and its control. 

One of the genes required for rapid floral induction 
is the FCA gene (Koornneef et al 1991). Plants carrying 
mutations of this gene flower much later than wild-type
under long photoperiods and short photoperiods. There is
a considerable range in flowering time within different
mutant fca alleles. The most extreme (fca-1) flowers
under long photoperiods with up to 40 leaves whereas
fca-3 and fca-4 flower with ~20 rosette leaves (compared
to 9 for wild-type Landsberg erecta). The late flowering
of all the fca mutants can be overcome, giving early
flowering in both long and short photoperiods, if imbibed
seeds, or plants of different developmental ages, are
given 3-8 weeks at 4°C (a vernalization treatment;
Chandler and Dean 1994).

We have cloned and sequenced the FCA gene of 
Arabidopsis thaliana, a homologue from Brassica and 
mutant sequences. 

According to a first aspect of the present invention 
there is provided a nucleic acid molecule comprising a
nucleotide sequence encoding a polypeptide with FCA
function. Those skilled in the art will appreciate that
"FCA function" refers to the ability to influence the
timing of flowering phenotypically like the FCA gene of
Arabidopsis thaliana, especially the ability to 
complement an fca mutation in Arabidopsis thaliana. 

Nucleic acid according to the invention may encode a 
polypeptide comprising the amino acid sequence shown in 
Figure 2, or an allele, variant, derivative or mutant 
thereof. Particular variants include those wherein the 
amino acid residues up-stream of the third methionine 
and/or up-stream of the second methionine in the amino 
acid sequence of Figure 2 are not included. Variants, 
mutants and derivatives of nucleic acid encoding such 
shorter polypeptide are of course provided by various 
embodiments of the present invention. 

Nucleic acid according to the present invention may 
have the sequence of an FCA gene of Arabidopsis thaliana, 
or be a mutant, variant (or derivative) or allele of the 
sequence provided. Preferred mutants, variants and 
alleles are those which encode a protein which retains a 
functional characteristic of the protein encoded by the 
wild-type gene, especially the ability to promote
flowering as discussed herein. Promotion of flowering
may advance, hasten or quicken flowering. Other
preferred mutants, variants and alleles encode a protein
which delays flowering compared to wild-type or a gene
with the sequence provided. Changes to a sequence, to
produce a mutant or variant, may be by one or more of
insertion, deletion or substitution of one or more
nucleotides in the nucleic acid, leading to the
insertion, deletion or substitution of one or more amino
acids. Of course, changes to the nucleic acid which make
no difference to the encoded amino acid sequence are
included. Particular variants, mutants and alleles are
discussed further below.

A preferred nucleic acid sequence covering the
region encoding the FCA gene is shown in Figure 1 and the
predicted amino acid sequence encoded by the FCA ORF is
shown in Figure 2. Nucleic acid may be subject to
alteration by way of substitution of nucleotides and/or a
combination of addition, insertion and/or substitution of
one or more nucleotides with or without altering the
encoded amino acid sequence (by virtue of the degeneracy
of the genetic code).

Nucleic acid according to the present invention may
comprise an intron, such as an intron shown in Figure 1,
for instance intron 3 (as in various embodiments e.g. as
illustrated herein), whether or not the encoded amino
acid sequence is altered. For example, the variant FCA
αB, whose nucleic acid sequence is shown in Figure 3,
comprises intron 3 of the sequence of Figure 1, such that
translation of the sequence results in a different amino
acid sequence from that of Figure 2 (intron 3 of Figure 1
contains a stop codon at 3026-3028 that is potentially
used in transcripts).

The present invention also provides a vector which
comprises nucleic acid with any one of the provided 
sequences, preferably a vector from which polypeptide 
encoded by the nucleic acid sequence can be expressed. 
The vector is preferably suitable for transformation into 
a plant cell. The invention further encompasses a host
cell transformed with such a vector, especially a plant 
cell. Thus, a host cell, such as a plant cell, comprising 
nucleic acid according to the present invention is 
provided. Within the cell, the nucleic acid may be 
incorporated within the chromosome. There may be more
than one heterologous nucleotide sequence per haploid 
genome. This, for example, enables increased expression 
of the gene product compared with endogenous levels, as 
discussed below. 

A vector comprising nucleic acid according to the 
present invention need not include a promoter, 
particularly if the vector is to be used to introduce the 
nucleic acid into cells for recombination into the 
genome.

Nucleic acid molecules and vectors according to the 
present invention may be provided isolated and/or 
purified from their natural environment, in substantially 
pure or homogeneous form, or free or substantially free 
of nucleic acid or genes of the species of interest or 
origin other than the sequence encoding a polypeptide 
able to influence flowering, eg in Arabidopsis thaliana 
nucleic acid other than the FCA sequence. Nucleic acid
according to the present invention may comprise cDNA,
RNA, genomic DNA and may be wholly or partially 
synthetic. The term "isolate" may encompass all these 
possibilities. 

The present invention also encompasses the
expression product of any of the nucleic acid sequences
disclosed and methods of making the expression product by
expression from encoding nucleic acid therefor under
suitable conditions in suitable host cells, e.g. E. coli
(see Example 7). Those skilled in the art are well able
to construct vectors and design protocols for expression
and recovery of products of recombinant gene expression.
Suitable vectors can be chosen or constructed, containing
one or more appropriate regulatory sequences, including
promoter sequences, terminator fragments, polyadenylation
sequences, enhancer sequences, marker genes and other
sequences as appropriate. For further details see, for
example, Molecular Cloning: a Laboratory Manual: 2nd
edition, Sambrook et al, 1989, Cold Spring Harbor
Laboratory Press. Transformation procedures depend on the
host used, but are well known. Many known techniques and
protocols for manipulation of nucleic acid, for example
in preparation of nucleic acid constructs, mutagenesis,
sequencing, introduction of DNA into cells and gene
expression, and analysis of proteins, are described in
detail in Short Protocols in Molecular Biology, Second
Edition, Ausubel et al. eds., John Wiley & Sons, 1992.
The disclosures of Sambrook et al. and Ausubel et al. are
incorporated herein by reference.

Purified FCA protein, or a fragment, mutant or 
variant thereof, e.g. produced recombinantly by
expression from encoding nucleic acid therefor, may be
used to raise antibodies employing techniques which are
standard in the art, as exemplified in Example 7.
Antibodies and polypeptides comprising antigen-binding
fragments of antibodies may be used in identifying
homologues from other species as discussed further below.

Methods of producing antibodies include immunising a 
mammal (eg human, mouse, rat, rabbit, horse, goat, sheep 
or monkey) with the protein or a fragment thereof. 
Antibodies may be obtained from immunised animals using 
any of a variety of techniques known in the art, and
might be screened, preferably using binding of antibody
to antigen of interest. For instance, Western blotting
techniques or immunoprecipitation may be used (Armitage
et al, 1992, Nature 357: 80-82). Antibodies may be
polyclonal or monoclonal.

As an alternative or supplement to immunising a 
mammal, antibodies with appropriate binding specificity 
may be obtained from a recombinantly produced library of
expressed immunoglobulin variable domains, eg using
lambda bacteriophage or filamentous bacteriophage which
display functional immunoglobulin binding domains on 
their surfaces; for instance see WO92/01047.

Antibodies raised to a polypeptide or peptide can be
used in the identification and/or isolation of homologous
polypeptides, and then the encoding genes. Thus, the
present invention provides a method of identifying or
isolating a polypeptide with FCA function (in accordance
with embodiments disclosed herein), comprising screening
candidate polypeptides with a polypeptide comprising the
antigen-binding domain of an antibody (for example whole
antibody or a fragment thereof) which is able to bind an
FCA polypeptide or fragment, mutant or variant thereof
or preferably has binding specificity for such a 
polypeptide, such as having the amino acid sequence shown 
in Figure 2 or Figure 8b. Specific binding members such 
as antibodies and polypeptides comprising antigen binding 
domains of antibodies that bind and are preferably 
specific for a FCA polypeptide or mutant, variant or 
derivative thereof represent further aspects of the 
present invention, as do their use and methods which 
employ them. 

Candidate polypeptides for screening may for 
instance be the products of an expression library created
using nucleic acid derived from a plant of interest, or
may be the product of a purification process from a 
natural source. 

A polypeptide found to bind the antibody may be 
isolated and then may be subject to amino acid 
sequencing. Any suitable technique may be used to 
sequence the polypeptide either wholly or partially (for
instance a fragment of the polypeptide may be sequenced).
Amino acid sequence information may be used in obtaining 
nucleic acid encoding the polypeptide, for instance by 
designing one or more oligonucleotides (e.g. a degenerate 
pool of oligonucleotides) for use as probes or primers in 
hybridisation to candidate nucleic acid, or by searching 
computer sequence databases, as discussed further below. 

The present invention further encompasses a plant 
comprising a plant cell comprising nucleic acid according 
to the present invention e.g. as a result of introduction 
of the nucleic acid into the cell or an ancestor thereof, 
and selfed or hybrid progeny and any descendant of such
a plant, also any part or propagule of such a plant, 
progeny or descendant, including seed. 

The FCA gene encodes a large protein (796 amino 
acids shown in Figure 2) with homology to a class of 
proteins identified as RNA-binding proteins (Burd and
Dreyfuss 1994). These proteins contain 80 amino acid RNA
recognition motifs (RRMs) and have a modular structure:
they can contain several RNA binding domains and
auxiliary domains rich in amino acids such as glycine,
glutamine and proline. The RRM proteins can be divided
into subfamilies based on homology within and around the
RRM domains. The FCA protein is most homologous to a
subfamily of RNA-binding proteins (cluster 1028.16;
identified using the BEAUTY database search, Worley et
al., 1995) exemplified by the Drosophila elav gene
(Robinow et al., 1988). Other members of this family
include the Drosophila Sex-lethal protein; the human
nervous system proteins HuD, HuC, Hel-N1, and Hel-N2; and
the Xenopus proteins elrA, elrB, elrC, elrD and etr-1.
FCA has two RNA-binding domains while most of the members
of the elav gene family have three RNA-binding domains.
The first two RNA-binding domains of the elav family (and
the spacing between the domains) are similar to the RNA-
binding domains in the FCA protein. In common with the
FCA protein, elav has a region with high glutamine
content. There is also a 20 amino acid region near the C
terminus of the FCA protein which shows strong homology 
to ORFs from two genes of unknown function from yeast and 
C. elegans. 

The FCA transcript is alternatively spliced. Five 
forms of the transcript are generated in cells. One,
herein called FCA transcript β, is ~2kb and represents
premature termination and polyadenylation within intron
3. FCA αA and αB have 19 of 20 introns spliced out but
intron 3 (2kb) remaining. FCA αA is the same as αB except
at intron 11 where different 5' and 3' exon/intron
junctions are used. FCA αA uses the 5' exon/intron
junction at 7055 bp (genomic sequence Fig. 1) and 3'
exon/intron junction at 7377 bp. FCA αB uses the 5'
exon/intron junction at 7130 bp (genomic sequence Fig. 1)
and 3' exon/intron junction at 7295 bp. FCA transcripts γA
and γB both have intron 3 removed and γA and γB use the
same junctions around intron 11 as αA and αB,
respectively. Only γB encodes both RNA-binding domains and
the conserved C-terminal domain (Figure 10).

RNA-binding proteins have been shown to be involved 
in several facets of post-transcriptional regulation. The
RNP motif forms a β sheet RNA binding surface engaging
the RNA as an open platform for interaction with either
other RNA molecules or other proteins. One of the most
well characterized genes encoding an RNP motif-containing
protein is the Drosophila SEX-LETHAL gene (Bell et al
1988). The SEX-LETHAL protein is involved in altering the
splicing of its own and other transcripts within the
pathway that determines sex in Drosophila. Only the
alternatively spliced product gives an active protein.
Thus this gene product is responsible for determining and
maintaining the female state. Other RNA-binding proteins
have been shown to function by localizing specific
transcripts in the nucleus or preventing translation of
specific transcripts. Six independently isolated fca
mutants have been described, and we have identified the
sequence changes causing a reduction in FCA activity in
three cases. The fca-1 mutation converted a C nucleotide
at position 6861 (Figure 1) into a T. Thus a glutamine
codon (CAA) is changed into a stop codon (TAA). The fca-3
mutation converted a G nucleotide at position 5271 into
an A. The effect of this mutation is to alter the 3'
splice junction of intron 7 such that a new 3' splice
junction is used 28 nucleotides into exon 8. The fca-4
mutation is the result of a rearrangement (an inversion
taking the 3' end of the gene 250kb away) with the
breakpoint at position 4570 (within intron 4).
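
The effect of the fca-1 change (a glutamine codon CAA becoming the stop codon TAA) can be checked against the standard genetic code. The following Python sketch is illustrative only: the function name classify_substitution and the compact way of building the codon table are assumptions introduced here and form no part of the disclosed sequences or methods.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, offset, new_base):
    """Classify a single-base change inside one codon (hypothetical helper).

    offset is the 0-based position of the changed base within the codon.
    Returns (mutated_codon, effect), where effect is "silent", "nonsense"
    or "missense".
    """
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        effect = "silent"
    elif after == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return mutated, effect


# fca-1: the first base of a glutamine codon (CAA) changes from C to T.
print(classify_substitution("CAA", 0, "T"))  # ('TAA', 'nonsense')
```

A nonsense change of this kind truncates the encoded polypeptide at the new stop codon, which is consistent with the reduction in FCA activity reported for fca-1.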

A further aspect of the present invention provides a 
method of identifying and cloning FCA homologues from
plant species other than Arabidopsis thaliana, which
method employs a nucleotide sequence derived from that
shown in Figure 1. Nucleic acid libraries may be screened
using techniques well known to those skilled in the art
and homologous sequences thereby identified then tested.
The provision of sequence information for the FCA gene of
Arabidopsis thaliana enables the obtention of homologous
sequences from Arabidopsis and other plant species. In
Southern hybridization experiments a probe containing the
FCA gene of Arabidopsis thaliana hybridises to DNA
extracted from Brassica rapa, Brassica napus and Brassica
oleracea. In contrast to most Arabidopsis genes, which
are normally present on the B. napus genome in 6 copies,
the FCA gene is present twice, on only one pair of
chromosomes. An FCA homologue from Brassica napus has
been isolated and sequenced and shows 86.1% average
nucleotide sequence homology within the exons, 65.8%
within introns and 78% identity at the amino acid level
(87% similarity). This Brassica gene fully complements a
mutation in the Arabidopsis FCA gene and can thus be
considered as a fully functional homologue. Homologues
have also been detected by Southern blot analysis from
Antirrhinum, tobacco, sugarbeet, tomato, pea, wheat,
maize, rice, rye, Lolium and oats.

The Brassica FCA homologue whose nucleotide sequence
is given in Figure 8a, including the coding sequence, and 
whose amino acid sequence encoded by the sequence of 
Figure 8a is shown in Figure 8b, represents and provides
further aspects of the present invention in accordance
with those disclosed for the Arabidopsis FCA gene. For
example, mutants, alleles and variants are included, e.g.
having at least 80% identity with the sequence of Figure
8b, though high levels of amino acid identity may be
limited to functionally significant domains or regions as
discussed. 

The present invention also extends to nucleic acid 
encoding an FCA homologue obtained using a nucleotide 
sequence derived from that shown in Figure 1, or the
amino acid sequence shown in Figure 2. Preferably, the
nucleotide sequence and/or amino acid sequence shares
homology with the sequence encoded by the nucleotide
sequence of Figure 1, preferably at least about 50%, or
at least about 60%, or at least about 70%, or at least
about 75%, or at least about 78%, or at least about 80%
homology, most preferably at least about 90% homology,
from species other than Arabidopsis thaliana and the
encoded polypeptide shares a phenotype with the
Arabidopsis thaliana FCA gene, preferably the ability to
influence timing of flowering. These may promote or
delay flowering compared with Arabidopsis thaliana FCA
and mutants, variants or alleles may promote or delay
flowering compared with wild-type. "Homology" may be
used to refer to identity. 
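
Where homology is taken to mean identity, a percentage figure can be computed from a pairwise alignment. The Python sketch below is one simple way of doing so and is offered under stated assumptions: the sequences are already aligned to equal length, "-" marks a gap, and columns containing a gap are excluded from the count. Other conventions for the denominator exist and give different figures; the function name percent_identity and the example strings are hypothetical and are not taken from Figures 2, 8b or 9.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical five-residue alignment with one mismatch.
print(percent_identity("MKTAY", "MKSAY"))  # 80.0
```

A candidate homologue could then be compared with a chosen threshold, for example `percent_identity(a, b) >= 80.0`, either over the whole alignment or over a functionally significant region only.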

In certain embodiments, an allele, variant, 
derivative, mutant or homologue of the specific sequence
may show little overall homology, say about 20%, or about
25%, or about 30%, or about 35%, or about 40% or about
45%, with the specific sequence. However, in
functionally significant domains or regions the amino
acid homology may be much higher. Comparison of the
amino acid sequences of the FCA polypeptides of the
Arabidopsis thaliana and Brassica napus genes, as in
Figure 9, reveals domains and regions with functional
significance, i.e. a role in influencing a flowering
characteristic of a plant, such as timing of flowering.
Deletion mutagenesis, for example, may be used to test
the function of a region of the polypeptide and its role
in or necessity for influence of flowering timing.

The nucleotide sequence information provided herein, 
or any part thereof, may be used in a data-base search to
find homologous sequences, expression products of which
can be tested for ability to influence a flowering
characteristic. These may have FCA function or the
ability to complement a mutant phenotype, which phenotype
is delayed flowering, where the delay can be reversed by
a vernalization treatment.

Vernalization is well known in the art and 
appropriate conditions are at the disposal of skilled 
artisans. Plants may be vernalized at the seed stage,
immediately after sowing. Vernalization may be carried out
for 8 weeks, in an 8 hour photoperiod (e.g. fluorescent
light, PAR 9.5 mmol m⁻² s⁻¹, R/FR ratio 3.9) at a
temperature of 5°C +/- 1°C.

In public sequence databases we recently identified 
several Arabidopsis cDNA clone sequences that were 
obtained in random sequencing programmes and share 
homology with FCA within both the RRM domains and in the
C-terminal regions. BLAST and FASTA searches of databases
have identified 23 Arabidopsis expressed sequence tags
(ESTs). These clones have been obtained and
used in low stringency hybridization experiments with
different regions of the FCA gene (central and 3'). Eight
clones show good homology to the 3' part of the FCA gene,
two clones show good homology to the central part and one
clone shows good homology to both (42A4, another RNA-
binding protein). Similarly, among randomly sequenced
rice cDNAs we have identified 10 rice ESTs. These 
hybridise to FCA genomic and cDNA clones under low 
stringency conditions. Five clones show good 
hybridization to FCA, particularly C1480. 

By sequencing homologues, studying their expression 
patterns and examining the effect of altering their 
expression, genes carrying out a similar function to FCA 
in regulating flowering time are obtainable. Of course, 
mutants, variants and alleles of these sequences are 
included within the scope of the present invention in the
same terms as discussed above for the Arabidopsis
thaliana FCA gene. 

The high level of homology between the FCA genes of 
Arabidopsis thaliana and Brassica napus, as disclosed 
herein, may also be exploited in the identification of 
further homologues, for example using oligonucleotides 
(e.g. a degenerate pool) designed on the basis of 
sequence conservation. 

According to a further aspect, the present invention 
provides a method of identifying or a method of cloning an
FCA homologue from a species other than Arabidopsis 
thaliana, the method employing a nucleotide sequence 
derived from that shown in Figure 1 or that shown in 
Figure 8a. For instance, such a method may employ an 
oligonucleotide or oligonucleotides which comprises or 
comprise a sequence or sequences that are conserved 
between the sequences of Figures 1 and 8a to search for 
homologues. Thus, a method of obtaining nucleic acid 
whose expression is able to influence a flowering 
characteristic of a plant is provided, comprising 
hybridisation of an oligonucleotide or a nucleic acid 
molecule comprising such an oligonucleotide to 
target/candidate nucleic acid. Target or candidate
nucleic acid may, for example, comprise a genomic or cDNA
library obtainable from an organism known to contain or
suspected of containing such nucleic acid. Successful
hybridisation may be identified and target/candidate
nucleic acid isolated for further investigation and/or
use.

Hybridisation may involve probing nucleic acid and 
identifying positive hybridisation under suitably
stringent conditions (in accordance with known
techniques) and/or use of oligonucleotides as primers in
a method of nucleic acid amplification, such as PCR. For
probing, preferred conditions are those which are
stringent enough for there to be a simple pattern with a
small number of hybridisations identified as positive
which can be investigated further. It is well known in
the art to increase stringency of hybridisation gradually
until only a few positive clones remain.

As an alternative to probing, though still employing
nucleic acid hybridisation, oligonucleotides designed to
amplify DNA sequences may be used in PCR reactions or
other methods involving amplification of nucleic acid,
using routine procedures. See for instance "PCR
Protocols: A Guide to Methods and Applications", Eds.
Innis et al, 1990, Academic Press, New York.

Preferred amino acid sequences suitable for use in 
the design of probes or PCR primers are sequences 
conserved (completely, substantially or partly) between
at least two FCA polypeptides able to influence a
flowering characteristic, such as timing of flowering,
e.g. with the amino acid sequences of Figures 2 and 8b
herein.

On the basis of amino acid sequence information
oligonucleotide probes or primers may be designed, taking
into account the degeneracy of the genetic code, and,
where appropriate, codon usage of the organism from which
the candidate nucleic acid is derived.

Preferably an oligonucleotide in accordance with the 
invention, e.g. for use in nucleic acid amplification, 
has about 10 or fewer codons (e.g. 6, 7 or 8), i.e. is
about 30 or fewer nucleotides in length (e.g. 18, 21 or
24).
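
The relationship between the number of codons, the oligonucleotide length and the size of a degenerate pool can be sketched in a few lines of Python. This is an illustration under stated assumptions only: it uses the standard genetic code, it ignores codon usage bias, and the function names and the six-residue peptide are hypothetical (the peptide is not taken from Figure 2 or Figure 8b).

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def oligo_length(peptide):
    """Length in nucleotides of an oligonucleotide covering the peptide."""
    return 3 * len(peptide)


def pool_degeneracy(peptide):
    """Number of distinct coding sequences in a fully degenerate pool."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


peptide = "MWKDEF"  # hypothetical conserved stretch of six residues
print(oligo_length(peptide))     # 18
print(pool_degeneracy(peptide))  # 16
```

Stretches rich in methionine and tryptophan (one codon each) give the smallest pools, whereas leucine, serine and arginine (six codons each) enlarge the pool rapidly, which is one reason to choose conserved regions with care.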

Assessment of whether or not such a PCR product
corresponds to an FCA homologue may be conducted in
various ways. A PCR band from such a reaction might
contain a complex mix of products. Individual products
may be cloned and each one individually screened. Each
may be analysed by transformation to assess function on
introduction into a plant of interest.

Generally, nucleic acid according to the invention 
may comprise a nucleotide sequence encoding a polypeptide
able to complement a mutant phenotype which is delayed in
flowering, where that delay can be corrected by a
vernalization treatment. Also the present invention
provides nucleic acid comprising a nucleotide sequence
which is a mutant or variant of a wild-type gene encoding
a polypeptide with ability to influence the timing of
flowering, the mutant or variant phenotype being delayed
in flowering with the timing of flowering being corrected
by vernalization. These are distinguished from the CO
gene reported by Putterill et al 1995, Putterill et al
1993 and the LD gene reported by Lee et al 1994. LD shows
similar characteristics to the FCA gene in that a
mutation in the gene confers late flowering that is
corrected by a vernalization treatment, but LD requires a
second gene product to influence flowering time in the
Arabidopsis thaliana Landsberg erecta ecotype (Lee et al
1994, Koornneef et al 1994). Thus in many plant species
manipulation of the LD gene alone may not influence
flowering time. The action of FCA is opposite
to that of phytochrome B, in that mutations in PHYB (hy3)
confer early flowering and introduction of an intact PHYB
gene into hy3 mutants restores normal flowering time
(Wester et al 1994). LD and CO are excluded from the
ambit of the present invention. FCA and mutants,
variants and alleles thereof may not complement an LD
mutation. LD and mutants, variants and alleles thereof
may not complement an FCA mutation.

The FCA amino acid sequence is totally different
from those of CO and LD.

The action of FCA can also be distinguished from 
ectopic expression of meristem identity or MADS box genes 
that alter flowering time (Weigel and Nilsson 1995, Chung 
et al 1994, Mandel and Yanofsky 1995, Mizukami and Ma
1992). Apart from an early flowering phenotype, ectopic
or overexpression of meristem identity or MADS box genes
produces many additional perturbations to both the
vegetative and floral phenotype of the plant (eg short
stature, reduced apical dominance, sterile flowers).

Also according to the invention there is provided a 
plant cell having incorporated into its genome a sequence 
of nucleotides where different introns have been removed. 
A further aspect of the present invention provides a 
method of making such a plant cell involving introduction 
of a vector comprising the sequence of nucleotides into a 
plant cell and causing or allowing recombination between 
the vector and the plant cell genome to introduce the 
sequence of nucleotides into the genome. 

Plants which comprise a plant cell according to the 
invention are also provided, along with any part or 
propagule thereof, seed, selfed or hybrid progeny and 
descendants and any part or propagule thereof.

The invention further provides a method of 
influencing the flowering characteristics of a plant 
comprising expression of a heterologous FCA gene sequence 
(or mutant, allele, variant or homologue thereof, as 
discussed) within cells of the plant. The term
"heterologous" indicates that the gene/sequence of
nucleotides in question has been introduced into said
cells of the plant or an ancestor thereof, using genetic
engineering, ie by human intervention. The gene may be on
an extra-genomic vector or incorporated, preferably
stably, into the genome. The heterologous gene may
replace an endogenous equivalent gene, ie one which
normally performs the same or a similar function in
control of flowering, or the inserted sequence may be
additional to the endogenous gene. An advantage of
introduction of a heterologous gene is the ability to
place expression of the gene under the control of a
promoter of choice, in order to be able to influence gene
expression, and therefore flowering, according to
preference. Furthermore, mutants and variants of the
wild-type gene, eg with higher or lower activity than
wild-type, may be used in place of the endogenous gene.

The principal flowering characteristic which may be
altered using the present invention is the timing of
flowering. Under-expression of the gene product of the
FCA gene leads to delayed flowering (as indicated by the
fca mutant phenotype and Example 3, antisense
experiments) that can be overcome to early flowering by a
vernalization treatment; over-expression may lead to
earlier flowering (Examples 2, 4 and 5). This degree of
control is useful to ensure synchronous flowering of male
and female parent lines in hybrid production, for
example. Another use is to advance or retard the
flowering in accordance with the dictates of the climate
so as to extend or reduce the growing season. This may
involve use of anti-sense or sense regulation.

The nucleic acid according to the invention, such as 
an FCA gene or homologue, may be placed under the control 
of an externally inducible gene promoter thus placing the 
timing of flowering under the control of the user. This 
is advantageous in that flower production, and subsequent 
events such as seed set, may be timed to meet market 
demands, for example, in cut flowers or decorative 
flowering pot plants. Delaying flowering in pot plants 
is advantageous to lengthen the period available for 
transport of the product from the producer to the point 
of sale and lengthening of the flowering period is an 
obvious advantage to the purchaser. 

In a further aspect the present invention provides a 
gene construct comprising an inducible promoter 
operatively linked to a nucleotide sequence provided by 
the present invention, such as the FCA gene of 
Arabidopsis thaliana, a homologue from another plant 
species, e.g. a Brassica such as Brassica napus, or any 
mutant, variant or allele thereof. As discussed, this 
enables control of expression of the gene. The invention 
also provides plants transformed with said gene construct 
and methods comprising introduction of such a construct 
into a plant cell and/or induction of expression of a 
construct within a plant cell, by application of a 
suitable stimulus, an effective exogenous inducer. 

The term "inducible" as applied to a promoter is 
well understood by those skilled in the art. In essence, 
expression under the control of an inducible promoter is 
"switched on" or increased in response to an applied 
stimulus. The nature of the stimulus varies between 
promoters. Some inducible promoters cause little or 
undetectable levels of expression (or no expression) in 
the absence of the appropriate stimulus. Other inducible 
promoters cause detectable constitutive expression in the 
absence of the stimulus. Whatever the level of expression 
is in the absence of the stimulus, expression from any 
inducible promoter is increased in the presence of the 
correct stimulus. The preferable situation is where the 
level of expression increases upon application of the 
relevant stimulus by an amount effective to alter a 
phenotypic characteristic. Thus an inducible (or 
"switchable") promoter may be used which causes a basic 
level of expression in the absence of the stimulus which 
level is too low to bring about a desired phenotype (and 
may in fact be zero). Upon application of the stimulus, 
expression is increased (or switched on) to a level which 
brings about the desired phenotype. 

Suitable promoters include the Cauliflower Mosaic 
Virus 35S (CaMV 35S) gene promoter that is expressed at a 
high level in virtually all plant tissues (Benfey et al, 
1990a and 1990b); the cauliflower meri 5 promoter that is 
expressed in the vegetative apical meristem as well as 
several well localised positions in the plant body, eg 
inner phloem, flower primordia, branching points in root 
and shoot (Medford, 1992; Medford et al, 1991) and the 
Arabidopsis thaliana LEAFY promoter that is expressed 
very early in flower development (Weigel et al, 1992). 

When introducing a chosen gene construct into a 
cell, certain considerations must be taken into account, 
well known to those skilled in the art. The nucleic acid 
to be inserted should be assembled within a construct 
which contains effective regulatory elements which will 
drive transcription. There must be available a method of 
transporting the construct into the cell. Once the 
construct is within the cell membrane, integration into 
the endogenous chromosomal material either will or will 
not occur. Finally, as far as plants are concerned the 
target cell type must be such that cells can be 
regenerated into whole plants. 

Plants transformed with the DNA segment containing 
the sequence may be produced by standard techniques which 
are already known for the genetic manipulation of plants. 
DNA can be transformed into plant cells using any 
suitable technology, such as a disarmed Ti-plasmid vector 
carried by Agrobacterium exploiting its natural gene 
transfer ability (EP-A-270355, EP-A-0116718, NAR 12(22) 
8711-8721 1984), particle or microprojectile 
bombardment (US 5100792, EP-A-444882, EP-A-434616), 
microinjection (WO 92/09696, WO 94/00583, EP 331083, EP 
175966), electroporation (EP 290395, WO 8706614) or other 
forms of direct DNA uptake (DE 4005152, WO 9012096, US 
4684611). Agrobacterium transformation is widely used by 
those skilled in the art to transform dicotyledonous 
species. Although Agrobacterium has been reported to be 
able to transform foreign DNA into some monocotyledonous 
species (WO 92/14828), microprojectile bombardment, 
electroporation and direct DNA uptake are preferred where 
Agrobacterium is inefficient or ineffective. 
Alternatively, a combination of different techniques may 
be employed to enhance the efficiency of the 
transformation process, eg bombardment with Agrobacterium 
coated microparticles (EP-A-486234) or microprojectile 
bombardment to induce wounding followed by co-cultivation 
with Agrobacterium (EP-A-486233). 

The particular choice of a transformation technology 
will be determined by its efficiency to transform certain 
plant species as well as the experience and preference of 
the person practising the invention with a particular 
methodology of choice. It will be apparent to the skilled 
person that the particular choice of a transformation 
system to introduce nucleic acid into plant cells is not 
essential to or a limitation of the invention. 

In the present invention, over-expression may be 
achieved by introduction of the nucleotide sequence in a 
sense orientation. Thus, the present invention provides a 
method of influencing a flowering characteristic of a 
plant, the method comprising causing or allowing 
expression of the polypeptide encoded by the nucleotide 
sequence of nucleic acid according to the invention from 
that nucleic acid within cells of the plant. 

Under-expression of the gene product polypeptide may 
be achieved using anti-sense technology or "sense 
regulation". 

The use of anti-sense genes or partial gene 
sequences to down-regulate gene expression is now 
well-established. Double-stranded DNA is placed under the 
control of a promoter in a "reverse orientation" such 
that transcription of the "anti-sense" strand of the DNA 
yields RNA which is complementary to normal mRNA 
transcribed from the "sense" strand of the target gene. 
The complementary anti-sense RNA sequence is thought then 
to bind with mRNA to form a duplex, inhibiting 
translation of the endogenous mRNA from the target gene 
into protein. Whether or not this is the actual mode of 
action is still uncertain. However, it is established 
fact that the technique works. See, for example, 
Rothstein et al, 1987; Smith et al, 1988; Zhang et al, 
1992; English et al, 1996. The complete sequence 
corresponding to the coding sequence in reverse 
orientation need not be used. For example fragments of 
sufficient length may be used. It is a routine matter 
for the person skilled in the art to screen fragments of 
various sizes and from various parts of the coding 
sequence to optimise the level of anti-sense inhibition. 
It may be advantageous to include the initiating 
methionine ATG codon, and perhaps one or more nucleotides 
upstream of the initiating codon. A suitable fragment 
may have about 14-23 nucleotides, e.g. about 15, 16 or 
17. 
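
By way of illustration only, the choice of candidate anti-sense 
fragments described above can be sketched in a few lines of Python. 
The sequence used is an invented toy string, not the FCA sequence, and 
the function names and the "upstream" parameter are assumptions made 
for this sketch; it simply lists every window of a chosen length that 
covers the initiating ATG plus a minimum number of upstream 
nucleotides, and returns the reverse complement of each window. 

```python
# Illustrative sketch only: toy sequence and helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_fragments(sense: str, atg_index: int, length: int, upstream: int = 2):
    """Yield (start, antisense) for each window of `length` nucleotides that
    covers the ATG at `atg_index` and at least `upstream` nucleotides 5' of it."""
    first = max(0, atg_index + 3 - length)  # left-most start still covering the ATG
    last = atg_index - upstream             # right-most start keeping the upstream bases
    for start in range(first, last + 1):
        window = sense[start:start + length]
        if len(window) == length:
            yield start, reverse_complement(window)


# Invented sense strand with the initiating ATG at index 4.
toy_sense = "GACCATGGCTTCAGGTCCAAGA"
fragments = list(antisense_fragments(toy_sense, atg_index=4, length=15))
for start, antisense in fragments:
    print(start, antisense)
```

Each returned string is the sequence that would be transcribed from 
the corresponding fragment cloned in reverse orientation; which 
fragment gives useful inhibition remains a matter for routine 
screening, as stated above. 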

Anti-sense regulation may itself be regulated by 
employing an inducible promoter in an appropriate 
construct. 

Thus, the present invention also provides a method 
of influencing a flowering characteristic of a plant, the 
method comprising causing or allowing anti-sense 
transcription from nucleic acid according to the 
invention within cells of the plant. 

When additional copies of the target gene are 
inserted in sense, that is the same, orientation as the 
target gene, a range of phenotypes is produced which 
includes individuals where over-expression occurs and 
some where under-expression of protein from the target 
gene occurs. When the inserted gene is only part of the 
endogenous gene the number of under-expressing 
individuals in the transgenic population increases. The 
mechanism by which sense regulation occurs, particularly 
down-regulation, is not well-understood. However, this 
technique is also well-reported in scientific and patent 
literature and is used routinely for gene control. See, 
for example, van der Krol, 1990; Napoli et al, 1990; 
Zhang et al, 1992. 

Thus, the present invention also provides a method 
of influencing a flowering characteristic of a plant, the 
method comprising causing or allowing expression from 
nucleic acid according to the invention within cells of 
the plant to suppress activity of a polypeptide with 
ability to influence a flowering characteristic. Here the 
activity of the polypeptide is preferably suppressed as a 
result of under-expression within the plant cells. 

Modified versions of FCA may be used in influencing a 
flowering characteristic of a plant. For example a 
mutant identified herein as fca-1, fca-3 or fca-4 may be 
employed. The sequence changes resulting in these 
mutants and the resulting phenotypes are discussed above. 

Promotion of FCA activity to cause early flowering 



flowering under both long and short day conditions, 
indicating FCA involvement in promoting flowering 
constitutively. Double mutant experiments have also 
indicated that FCA function may be required both upstream 
and downstream of the gene products involved in 
conferring inflorescence/floral meristem identity, eg 
LEAFY, APETALA1 and TERMINAL FLOWER. Thus FCA function 
may be involved in the ability of meristems to respond to 
LEAFY, APETALA1 and TERMINAL FLOWER gene products. 

The fully spliced FCA transcript is present at very 
low abundance in all conditions so far analysed. Although 
the fca mutation is recessive, transgenic fca plants 
homozygous for an introduced wild-type FCA gene flowered 
slightly earlier than plants carrying one copy (Example 2), 
suggesting that under some conditions the level of the 
FCA transcript is limiting to flowering time. This 
indicates that flowering may be manipulated by using 
foreign promoters to alter the expression of the gene. In 
addition, the majority of the transcript is present in a 
form that cannot make active protein. Thus alternative 
splicing may be a specific control mechanism to maintain 
relatively low levels of the FCA protein. Alteration of 
this splicing pattern, for example by introducing an FCA 
gene lacking introns into plants, may give much higher 
levels of the FCA protein which in turn would give 
accelerated flowering. 

Causing early flowering under non-inductive or inductive 
conditions 

Wild-type Arabidopsis plants flower extremely 
quickly under inductive conditions and the FCA gene is 
expressed prior to flowering, although at a low level. 
The level of the FCA product may be increased by 
introduction of promoter, eg CaMV 35S or meri 5, fusions. 
In addition, introduction of an FCA gene lacking introns 
may increase the level of FCA protein and cause early 
flowering in all conditions. 

Inhibition of FCA activity to cause late flowering 

fca mutations cause late flowering of Arabidopsis. 
Transgenic approaches may be used to reduce FCA activity 
and thereby delay or prevent flowering in a range of 
plant species. A variety of strategies may be employed. 
This late flowering can then be overcome, if so desired, 
by giving the imbibed seed, or plants of different ages, a 
vernalization treatment. 



Expression of sense or anti-sense RNAs 

In several cases the activity of endogenous plant 
genes has been reduced by the expression of homologous 
antisense RNA from a transgene, as discussed above. 
Similarly, the expression of sense transcripts from a 
transgene may reduce the activity of the corresponding 
endogenous copy of the gene, as discussed above. 
Expression of an antisense transcript from the FCA gene 
has been shown to reduce activity of the endogenous gene 
and cause late flowering (Example 3). 

Expression of modified versions of the FCA protein 

RNA binding proteins have a modular structure in 
which amino acid sequences required for binding different 
RNA molecules are separate domains of the protein (Burd 
and Dreyfuss 1994). This permits the construction of 
truncated or fusion proteins that display only one of the 
functions of the RNA binding protein. In the case of FCA, 
modification of the gene in vitro and expression of 
modified versions of the protein may lead to dominant 
inhibition of the endogenous, intact protein and thereby 
delay flowering. This may be accomplished in various 
ways, including the following: 

Expression of a truncated FCA protein. 

Some multi-RNP motif proteins can bind different RNA 
sequences simultaneously. U1 A for example, binds to U1 
small nuclear RNA through its first RNA-binding domain 
and to pre-mRNA sequences through its second, thus 
controlling splicing (Burd and Dreyfuss 1994). Expression 
of an FCA protein with only one of these RNP motifs may 
dominantly block FCA action, by preventing binding of the 
full size FCA protein. Also expression of a mutant FCA 
protein not encoding the C terminal sequences may prevent 
the correct alignment of the binding of the RNA molecule 
and so again block wild-type FCA binding. 

Aspects and embodiments of the present invention 
will now be illustrated, by way of example, with 
reference to the accompanying figures. Further aspects 
and embodiments will be apparent to those skilled in the 
art. All documents mentioned in this text are 
incorporated herein by reference. 

In the Figures: 

Figure 1 shows a nucleotide sequence according to 
one embodiment of the invention, being the sequence of 
the genomic region encoding FCA obtained from Arabidopsis 
thaliana. Introns are shown in small letters, exons in 
capitals. Features: (1118) - transcription start; 
(1532-1534, 1568-1570, 1601-1603) - putative translation 
start ATG; (2753) - Poly A site of β transcript; 
(7056-7377) - alternative splicing around intron II; 
(8771-8773) - translation stop TAA; (9256) - Poly A 
site. Additional translational stop codon at 3026-3028 
within intron 3. 

Figure 2 shows the predicted amino acid sequence 
derived from the nucleotide sequence encoding the FCA 
ORF. 

Figure 3 shows the nucleotide sequence of the FCA αB 
gene, including 5' and 3' flanking sequences. The 
sequence within the ORF is that of one of the abundant 
transcripts, that is 18 introns have been spliced out but 
intron 3 remains. The position of termination of the 
other abundant transcript is indicated. Primer sequences 
are given in Table 2. Restriction sites: SalI - 352; 
HindIII - 776; XbaI - 1157; HindIII - 3125; BglII - 3177; 
ClaI - 3293; BamHI - 3549; HindIII - 4728; SpeI - 5003. 
Other important landmarks: 1293 - poly A tail added after 
this nucleotide in cDNA clone 77B or FCA transcript β; 
897 - 5' splice site of intron 3; 2973 - 3' splice site of 
intron 3. 

Figure 4 compares the FCA RRM motifs with those from 
the Drosophila SEX-LETHAL and TRA-2 genes. Also shown 
are the C-terminal amino acids with homology to yeast and 
C. elegans proteins. 

Figure 5 shows the recombination analysis to 
position the FCA gene. 

Figure 6 shows the complementation analysis to 
localize the FCA gene. 

Figure 7 shows the complexity and position of the 
FCA gene on the complementing cosmids. 

Figure 8 shows the nucleotide sequence of the 
Brassica napus FCA homologue and encoded polypeptide: 
Figure 8a - Brassica FCA nucleotide sequence including 
coding sequence; Figure 8b - polypeptide amino acid 
sequence encoded by coding sequence of Figure 8a. 

Figure 9 shows an alignment of the Arabidopsis and 
Brassica FCA amino acid sequences. Top line is 
Arabidopsis; bottom line is Brassica. 

Figure 10 shows the different transcripts produced 
from the FCA gene. Key: open reading frame; * conserved 
region in C. elegans and yeast ESTs; R1, R2 RNA-binding 
domains 1 and 2. 

EXAMPLE 1 - CLONING AND ANALYSIS OF THE FCA GENE 

Identification of a 300kb genomic region carrying the FCA 
gene of Arabidopsis thaliana. 

The fca mutation had been mapped relative to visible 
markers to 29cM on chromosome 4. In order to map the 
locus relative to molecular markers as a starting point 
for cloning by chromosome walking, the segregation 
pattern of RFLP markers mapping to the top half of 
chromosome 4 was analysed in 171 late (homozygous 
recessive class) flowering individuals from the F2 of a 
cross between the late flowering mutant fca-1 (in a 
Landsberg erecta background) and the polymorphic early 
flowering ecotype Columbia. This analysis positioned the 
FCA locus in a 5.2cM interval between markers m326 and 
m226. 

These markers were then used as the starting points 
for the chromosome walk. YAC clones containing these RFLP 
markers were identified by colony hybridization 
experiments. In the initial experiments, the YAC 
libraries used were the EG, EW and ABI libraries but as 
another became available (yUP, May 1992) it was 
incorporated into the analysis. Positively hybridizing 
YAC clones were confirmed using Southern blot analysis. 
They were sized using PFGE and Southern blot analysis and 
then end-probes were generated using either inverse PCR 
or left-end rescue for use in chromosome walking 
experiments. In the majority of cases, each step in the 
walk was covered by two independent YAC clones to avoid 
false linkages generated by chimaeric YAC clones. These 
constituted a significant fraction of the EG, EW and yUP 
libraries and complicated the assembly of the YAC contig. 
The result of the generation and analysis of 65 
end-probes was a YAC contig covering the m326-m226 interval 
that included 57 YAC clones. 

Polymorphisms between Landsberg erecta and Columbia 
were determined for the left end-probe of EG9D2, right 
end-probe of YAC clone yUP13C7, right end-probe of YAC 
clone yUP3F7 and right end-probe of YAC clone EW20B3. 
Analysis of the segregation pattern of these markers on 
pooled progeny of recombinants with cross-over points 
mapping in the m326-m226 interval defined the region 
carrying the FCA gene to between the polymorphisms 
identified by yUP3F7RE and m226. This interval was 
covered by two overlapping YAC clones EW20B3 and 
ABI10C10. 

In order to further define the position of the FCA 
gene, more probes were required that mapped within the 
two overlapping YAC clones. This was achieved by using 
end-probes from YAC clones ABI3C4, ABI6C3, a random Sau3A 
fragment from YAC clone EW20B3 (W5) and two cosmids cAtA2 
and g19247. Restriction maps for SmaI, MluI and PacI were 
constructed and used to position the probes within the 
YAC clones. 

Additional recombinants, where the cross-over point 
mapped close to the FCA locus, were generated by 
selecting individual plants that were arabinose resistant 
and had an early/intermediate flowering time from the F2 
generation of a cross between fca (in Landsberg erecta) 
and ara1 (in Columbia). Progeny of these were checked to 
confirm that they were homozygous for the arabinose 
resistance allele and heterozygous for the fca mutation. 
Three of these individuals (A2/7, A1/8 and A4/7) were 
analysed with the RFLP markers 3F7RE, W5, cAtA2, 19247, 
3C4LE, 6C3LE and 226. This defined the north end of the 
genomic region carrying the FCA gene as within the 
cosmids cAtA2 and 19247. This information is summarized 
in Fig. 5. 

Complementation analysis to define the FCA gene. 

The two YAC clones EW20B3 and ABI10C10 were 
gel-purified and hybridized to filters carrying 25500 cosmid 
clones that contained 15-20kb of Arabidopsis thaliana 
Landsberg erecta genomic DNA. This cosmid library was 
constructed in a new vector (04541) by cloning a 1.6kb 
BglII fragment from pHC79 carrying the lambda cos 
fragment into the vector pSLJ1711. The resulting 
highly stable cosmid cloning vehicle carries 
Agrobacterium border sequences for transfer of DNA into 
plant chromosomes, a 35S-NPTII plant selectable marker, 
lacZ-lacI sequences for the blue/white insert selection 
in E. coli and a polylinker with 7 cloning sites. 

Positively hybridizing colonies were analysed by 
hybridizing each clone to Southern blots carrying all the 
cosmid clones digested with HindIII, EcoRI and BamHI. 
This generated a restriction map for the insert of each 
cosmid and indicated which clones carried overlapping 
inserts. The cosmids were also run alongside plant DNA 
and hybridized with the cosmid to confirm that the cosmid 
insert was colinear with the plant DNA. The two cosmid 
clones, cAtA2 and cAtB1, mapping to this interval were 
isolated from a different cosmid library (Olszewski and 
Ausubel 1988). The result of this analysis was a cosmid 
contig covering the 300kb interval in which the FCA locus 
had been defined. 

Six mutant fca alleles were available, two of which 
had been generated by FN irradiation and one by X-ray 
irradiation. Irradiation-induced mutations are frequently 
associated with genomic rearrangements or deletions. To 
see whether this would further refine the location of the 
FCA gene, the genomic region covered by the YAC clones 
EW20B3 and ABI10C10 was examined in all six alleles. The 
two YAC clones were hybridized to PFGE Southern blots 
carrying DNA from the different alleles digested with SmaI 
and MluI. A ~50kb MluI fragment was found to be slightly 
smaller in the fca-4 allele. Further analysis by 
hybridization of cosmid clones, corresponding to the 
region showing the difference, indicated that part of the 
alteration had occurred in a 1.9kb BamHI fragment carried 
in cosmids cAtA2 and 19247. This focused our efforts in 
the first complementation experiments on cosmid clones at 
the north end of the contig. 

Eleven cosmid clones shown in Fig 6, starting with 
those at the left end, were introduced into the 
Arabidopsis fca-1 mutant using the root explant 
transformation procedure (Valvekens et al 1988). Seed 
were collected from self-fertilized kanamycin resistant 
individuals and analysed with respect to their kanamycin 
segregation and flowering time. The number of 
transformants showing complementation to early flowering 
for each cosmid is shown in Figure 6. The four cosmids 
that resulted in complementation mapped to the end of the 
genomic region where the inversion in the fca-4 allele 
mapped. 
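
As an illustration of how kanamycin segregation in selfed progeny 
might be assessed, the following Python sketch computes a chi-square 
goodness-of-fit statistic against a 3:1 resistant:sensitive ratio, 
which is the ratio expected if the T-DNA is inserted at a single 
locus. The single-locus expectation, the function names and the 
seedling counts are assumptions introduced for this sketch and are 
not data from the experiments described here. 

```python
# Illustrative sketch only: counts below are invented, not experimental data.

# Chi-square critical value for 1 degree of freedom at the 5% level.
CRITICAL_DF1_P05 = 3.841


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_single_locus(resistant: int, sensitive: int) -> bool:
    """True if the counts do not differ significantly from 3:1 at the 5% level."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_DF1_P05


# Hypothetical progeny of one transformant scored on kanamycin plates.
stat = chi_square_3_to_1(152, 48)
print(round(stat, 3), consistent_with_single_locus(152, 48))
```

A line passing such a test could then be scored for flowering time in 
the knowledge that the early flowering class should co-segregate with 
kanamycin resistance. 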

Identification of the FCA gene. 

The complete genomic sequence of the Columbia allele 
corresponding to the genomic region within the 
complementing cosmid clones was obtained through the 
efforts of the Arabidopsis sequencing initiative centred 
within this department (G. Murphy pers. comm.). The 
majority of the genomic region contained in the 
complementing cosmids is carried on three BamHI 
restriction fragments, 4, 1.9 and 2kb. These were 
isolated and hybridized either separately or pooled to 
1x10^6 phage clones of the PRL-2 cDNA library. This 
library had been made from pooled RNA samples and was 
made available by Tom Newman (Michigan). Four clones 
hybridizing to the 2kb BamHI fragment and 3 to the 4 and 
1.9kb fragments were isolated and characterized. They 
identified two cDNA clones with insert sizes ~1700bp and 
1350bp. Analysis of the sizes of the transcripts 
hybridizing to these two cDNA clones showed that one (in 
fca-4) was reduced in size relative to the other alleles 
and wild-type and so this cDNA clone was assigned to the 
FCA gene. The other clone showed no differences and was 
termed 77B. 

The transcript size of the putative FCA gene was 
>3kb indicating that the cDNA clone was not full length. 
The cDNA clone was sequenced and found to contain an 
insert of 1811bp. Primers were designed from the genomic 
sequence (marked BamX primer on Figure 3) and the 5' end 
of the cDNA sequence (marked IanRT1 and IanRT2 on Figure 
3). First strand cDNA was made using the IanRT2 primer to 
prime RNA isolated from wild-type seedlings (2 leaf 
stage). This was used with primers BamX and IanRT2 to PCR 
amplify a fragment detected as a faint band on an 
ethidium bromide stained gel. The PCR product was diluted 
1/300 and reamplified using primers BamX and IanRT1. The 
product from this reaction was end-filled using T4 DNA 
polymerase and cloned into the EcoRV site of the general 
cloning vector Bluescript KSII (Stratagene). The product 
was sequenced and found to be colinear with the genomic 
sequence and extend the sequence of the cDNA clone by 
735bp. 

The sequence was compared to all available sequences 
using BlastX, BlastN and TBlastN. Significant homologies 
were detected in the TBlastN search to a class of 
proteins previously defined as RNA binding proteins. The 
characteristic of these proteins is the presence of one 
or more RRM motifs made up of conserved amino acids 
covering an 80 amino acid region (shown in Fig. 4). The 
positioning of sub-motifs RNP2 and RNP1 and individual 
conserved amino acids is always maintained within the 
whole RRM motif. Translation of the sequence of the FCA 
cDNA clone extended in the RT-PCR experiments showed the 
presence of multiple translation stop codons in the 5' 
region of the sequence. The first methionine residue 
downstream of the last translation stop codon and in 
frame with the rest of the FCA protein was located in the 
middle of the RRM motif, splitting RNP2 and RNP1. The 
strong homology of the RRM motif to other RNA binding 
proteins suggested that this MET residue was not the 
beginning of the FCA protein. In addition, the 
transcripts of a large number of RNA-binding proteins are 
alternatively spliced to yield active and inactive 
products. The splicing is then regulated, often in an 
autoregulatory fashion, to control the production of the 
active protein. These facts suggested that the FCA 
transcript generated in the RT-PCR experiments contained 
an intron, just upstream of the RRM motif. 
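
The reading-frame reasoning above (locating the first methionine 
codon that lies downstream of the last in-frame stop codon) can be 
sketched as follows. This is a minimal Python illustration on an 
invented toy sequence; the function name and the "anchor" argument 
(the index of a codon already known to be in the coding frame, for 
instance one inside a conserved motif) are assumptions made for the 
sketch and do not describe the software actually used. 

```python
# Illustrative sketch only: toy sequence and helper names are hypothetical.
STOPS = {"TAA", "TAG", "TGA"}


def first_in_frame_atg(seq: str, anchor: int):
    """Return the index of the first ATG that is in frame with the codon at
    `anchor` and lies after the nearest upstream in-frame stop codon.
    Returns None if a stop codon is reached before any ATG."""
    # Walk upstream one codon at a time until an in-frame stop or the 5' end.
    pos = anchor
    while pos - 3 >= 0 and seq[pos - 3:pos] not in STOPS:
        pos -= 3
    # Scan downstream from the first codon after that stop.
    for i in range(pos, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOPS:
            return None
        if codon == "ATG":
            return i
    return None


# Invented sequence: two in-frame stops in the 5' region, then ATG at index 15.
toy = "".join(["GCT", "TAA", "GGT", "TGA", "CCA",
               "ATG", "GCT", "CGT", "TTC", "AAA", "GGA"])
start = first_in_frame_atg(toy, anchor=24)  # codon TTC assumed to be coding
print(start, toy[start:start + 3])
```

If the methionine found in this way falls inside a conserved motif, 
as was observed here, the true start of the protein is likely to lie 
further upstream, which is consistent with the presence of an 
unspliced intron. 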

In order to test this hypothesis, several primers 
were designed from the genomic sequence for use in 
further RT-PCR experiments. First strand cDNA was made 
from RNA isolated from seedlings (4 leaf stage), primed 
with random hexamers (Boehringer). Primers lying within 
the sequence 5' to the FCA cDNA up to the 3' end of the 
77B cDNA (the other cDNA clone hybridizing to the 
complementing cosmid clones), together with IanRT1 gave 
amplification products of the expected size from the 
genomic sequence but did not yield smaller products as 
would be expected from a transcript in which an 
intervening intron had been spliced out. A primer lying 
within the 77B cDNA clone marked as cDNAII-BamHI (in 
Fig. 3) was then used in conjunction with the IanRT1 
primer. No band was visible on an ethidium bromide 
stained agarose gel after 30 cycles of amplification. The 
PCR reaction was then diluted 1/300 and re-amplified 
using primers cDNAII-1 and RevEx4 (shown in Figure 3). 
The PCR product was digested with SalI and BglII 
restriction enzymes and cloned into SalI and BamHI 
digested Bluescript KSII plasmid. Sequence analysis of the 
760bp product and comparison to the genomic sequence 
revealed that a 2kb intron had been spliced out to join 
the ORF within the 77B cDNA to that carrying the RRM 
motif in the FCA gene. This splicing revealed the 
presence of a second intact RRM motif interrupted by 
intron 3. 

Direct comparison of the FCA sequence with that of 
LUMINIDEPENDENS and CO, the other flowering time genes 
cloned from Arabidopsis (Lee et al, 1994; Putterill et al, 
1995), detected no significant homology. 

Mutations in the fca mutant alleles. 

cDNA was made from RNA isolated from the mutant 
alleles. This was amplified using cDNAII-BamHI and 
cDNA-3'a; BamX and IanRT1; fca5'-1 and fca3'-a (positions 
indicated on Fig 3). The resulting PCR fragments were 
cloned and sequenced and compared to the sequence of the 
wild-type Landsberg erecta transcript. The fca-1 mutation 
converted a C nucleotide at position 6861 into a T. Thus 
a glutamine codon (CAA) is changed into a stop codon 
(TAA). The fca-3 mutation converted a G nucleotide at 
position 5271 into an A. The effect of this mutation is 
to alter the 3' splice junction of intron 7 such that a 
new 3' splice junction is used 28 nucleotides into exon 
8. The fca-4 mutation is the result of a rearrangement 
with the break-point at position 4570 (within intron 4). 
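
The consequences of the fca-1 and fca-3 changes for the reading 
frame can be checked with simple arithmetic, sketched here in Python. 
The helper names are hypothetical, the codon table is deliberately 
limited to the two codons named above, and the frameshift check 
assumes that use of the new 3' splice junction simply omits the first 
28 nucleotides of exon 8 from the mature transcript. 

```python
# Illustrative sketch only: helper names are hypothetical.

# Only the two codons named in the text are needed for this check.
CODON_MEANING = {"CAA": "Gln", "TAA": "Stop"}


def point_mutate(codon: str, offset: int, new_base: str) -> str:
    """Return `codon` with the base at `offset` replaced by `new_base`."""
    return codon[:offset] + new_base + codon[offset + 1:]


def shifts_reading_frame(omitted_nucleotides: int) -> bool:
    """True if omitting this many nucleotides changes the reading frame."""
    return omitted_nucleotides % 3 != 0


# fca-1: C to T at the first position of a glutamine codon.
mutant_codon = point_mutate("CAA", 0, "T")
print(mutant_codon, CODON_MEANING[mutant_codon])

# fca-3: new 3' splice junction 28 nucleotides into exon 8 (assumed omission).
print(shifts_reading_frame(28))
```

Since 28 is not a multiple of three, the fca-3 transcript would be 
read out of frame downstream of the new junction under the stated 
assumption. 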

EXAMPLE 2 - ISOLATION AND SEQUENCE ANALYSIS OF THE 
BRASSICA NAPUS HOMOLOGUE. 

A Brassica napus genomic library constructed from 
Sau3A partially digested DNA cloned into lambda 
DASH(R) II/BamHI vector (Stratagene) was obtained. The 
library was screened using the 1811bp FCA cDNA clone. A 
clone carrying a 12kb insert was isolated which 
hybridized to the FCA cDNA clone and the 77B cDNA clone. 
The lambda clone was digested with SalI which released 
the full length 12kb Brassica insert and this was cloned 
into Bluescript KSII. Restriction fragments of this clone 
(a combination of EcoRI, SacI and BamHI) were subcloned 
into Bluescript KSII and sequenced. 

The 12kb Brassica fragment was also subcloned into 
the XhoI restriction site of the Agrobacterium binary 
vector pSLJ1714 (Jones et al 1992), for transformation 
into the fca mutant. When introduced into the fca-4 
mutant, using root explant transformation, progeny of 
the transformant segregated early flowering plants. These 
flowered with a mean of 8.3 leaves compared to wild-type 
Landsberg erecta grown alongside with 9.1 leaves and 
fca-4 with 24.1 leaves. Thus the Brassica FCA gene fully 
complements the fca-4 mutation. 

Expression of FCA mRNA 

PolyA mRNA was isolated from a range of 
developmental stages: 2 leaf, 4 leaf, 6 leaf and 10 leaf, 
roots and inflorescences, fractionated on Northern blots 
and hybridized with the 1811bp FCA cDNA clone. The 
combined FCA transcript γ was present at approximately 
the same amount in all tissues examined except for the 
inflorescences where expression was slightly lower. The 
prematurely polyadenylated transcript β was detected 
using 77B cDNA clone as a probe. The β transcript was 
~20-fold more abundant than γA+B. Transcripts αA+B 
containing intron 3 were not detected on a northern blot 
and could only be found using RT-PCR. 

FCA expression has also been analysed using RNase
protection assays. Using a probe (725 bp to 1047 bp from
the γB construct), the γA+B transcripts were detected at
similar levels in a range of developmental stages in both
long and short day photoperiods, and at lower levels in
rosettes and inflorescences of mature plants. The β
transcript was at a higher level in these tissues,
consistent with the northern blot analysis.

METHODS FOR EXAMPLES 1 AND 2
Growth conditions and measurement of flowering time

Flowering time was measured under defined conditions
by growing plants in Sanyo Gallenkamp Controlled
Environment rooms at 20°C. Short days comprised a
photoperiod of 10 hours lit with 400 watt metal halide
power star lamps supplemented with 100 watt tungsten
halide lamps. This provided a level of photosynthetically
active radiation (PAR) of 113.7 µmoles photons m-2 s-1 and
a red:far-red light ratio of 2.41. A similar cabinet and
lamps were used for the long day. The photoperiod was
10 hours under the same conditions used for short days,
extended for a further 8 hours using only the
tungsten halide lamps. In this cabinet the combination of
lamps used for the 10 hour period provided a PAR of 92.9
µmoles photons m-2 s-1 and a red:far-red ratio of 1.49.
The 8 hour extension produced a PAR of 14.27 µmoles m-2 s-1
and a red:far-red ratio of 0.66.
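
As a worked illustration of the two light regimes, the total daily photon dose can be estimated from the PAR values and durations quoted above. The sketch below is only an arithmetic aid; the function and variable names are illustrative and are not part of the original methods.

```python
# Daily photon dose (mol photons per m^2 per day) for each light regime.
# PAR values and durations are those quoted in the text; names are illustrative.

def daily_light_integral(phases):
    """phases: list of (PAR in umol m^-2 s^-1, duration in hours)."""
    micromoles = sum(par * hours * 3600 for par, hours in phases)
    return micromoles / 1e6  # convert umol to mol

short_day = [(113.7, 10)]            # 10 h, metal halide + tungsten halide
long_day = [(92.9, 10), (14.27, 8)]  # 10 h full light + 8 h tungsten extension

print(f"short day: {daily_light_integral(short_day):.2f} mol m^-2 d^-1")
print(f"long day:  {daily_light_integral(long_day):.2f} mol m^-2 d^-1")
```

On these figures the long-day cabinet delivers a slightly lower total dose (roughly 3.8 versus 4.1 mol m-2 per day), which suggests that the two regimes differ mainly in day length rather than in the quantity of light supplied.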

The flowering times of large populations of plants
were measured in both greenhouse and cabinet conditions.
Flowering time was measured by counting the number of
leaves, excluding the cotyledons, in the rosette and on
the inflorescence. Leaf numbers are shown with the
standard error at 95% confidence limits. The number of
days from sowing to the appearance of the flower bud was
also recorded, but is not shown. The close correlation
between leaf number and flowering time was previously
demonstrated for Landsberg erecta and fca alleles
(Koornneef et al, 1991).

Cosmid and RFLP markers. 

DNA of lambda clones m210, m326, m580 and m226 was
obtained from Elliot Meyerowitz (Caltech, Pasadena).
Total DNA was used as radiolabelled probe to YAC library
colony filters and plant genomic DNA blots. Cosmids
g10086, g4546, g4108 and g19247 were obtained from Brian
Hauge and Howard Goodman (MGH, Boston), cultured in the
presence of 30 mg/l kanamycin, and maintained as glycerol
stocks at -70°C. Total cosmid DNA was used as
radiolabelled probe to YAC library colony filters and
plant genomic DNA blots. Cosmid clones cAtA2 and cAtB1
were obtained from Chris Cobbett (University of
Melbourne) and cultured in the presence of 10 mg/l
tetracycline. Cosmid pCITd23 was provided by Elliot
Meyerowitz (Caltech, Pasadena), cultured in the presence
of 100 µg/ml streptomycin/spectinomycin and maintained as
a glycerol stock at -70°C. pCIT30 vector sequences
share homology with pYAC4-derived vectors, and therefore
YAC library colony filters were hybridised with insert
DNA extracted from the cosmid. Total DNA of pCITd23 was
used as radiolabelled probe to plant genomic DNA blots.

YAC libraries.

The EG and ABI libraries were obtained from Chris
Somerville (Michigan State University). The EW library
was obtained from Jeff Dangl (Max Delbruck Laboratory,
Cologne) and the yUP library from Joe Ecker (University
of Pennsylvania). Master copies of the libraries were
stored at -70°C (as described by Schmidt et al. Aust. J.
Plant Physiol. 19: 341-351 (1992)). The working stocks
were maintained on selective Kiwibrew agar at 4°C.
Kiwibrew is a selective, complete minimal medium minus
uracil, and containing 11% Casamino acids. Working stocks
of the libraries were replated using a 96-prong
replicator every 3 months.

Yeast colony filters.

Hybond-N (Amersham) filters (8 cm x 11 cm) containing
arrays of yeast colony DNA from 8-24 library plates were
produced and processed as described by Coulson et al.
Nature 335:184-186 (1988) and modified as described by
Schmidt and Dean, Genome Analysis, vol. 4: 71-98 (1992).
Hybridisation and washing conditions were according to
the manufacturer's instructions. Radiolabelled probe DNA
was prepared by random-hexamer labelling.

Yeast chromosome preparation and fractionation by pulsed
field gel electrophoresis (PFGE).

Five millilitres of Kiwibrew was inoculated with a
single yeast colony and cultured at 30°C for 24 h. Yeast
spheroplasts were generated by incubation with 2.5 mg/ml
Novozym (Novo Biolabs) for 1 h at room temperature. Then
1 M sorbitol was added to bring the final volume of
spheroplasts to 50 µl. Eighty microlitres of molten LMP
agarose (1% InCert agarose, FMC) in 1 M sorbitol was
added to the spheroplasts, the mixture was vortexed
briefly and pipetted into plug moulds. Plugs were placed
into 1.5 ml Eppendorf tubes and then incubated in 1 ml of
1 mg/ml Proteinase K (Boehringer Mannheim) in 100 mM EDTA,
pH 8, 1% Sarkosyl for 4 h at 50°C. The solution was
replaced and the plugs incubated overnight. The plugs
were washed three times for 30 min each with TE and twice
for 30 min with 0.5 x TBE. PFGE was carried out using
the Pulsaphor system (LKB). One-third of a plug was
loaded onto a 1% agarose gel and electrophoresed in 0.5 x
TBE at 170 V, 20 s pulse time, for 36 h at 4°C. DNA markers
were concatemers of lambda DNA prepared as described by
Bancroft and Wolk, Nucleic Acids Res. 16:7405-7418 (1988).
DNA was visualised by staining with ethidium bromide.

Yeast genomic DNA for restriction enzyme digestion and
inverse polymerase chain reaction (IPCR).

Yeast genomic DNA was prepared essentially as
described by Heard et al. (1989) except that yeast
spheroplasts were prepared as above. Finally, the DNA was
extracted twice with phenol/chloroform, once with
chloroform and ethanol precipitated. The yield from a 5 ml
culture was about 10 µg DNA.

Isolation of YAC left-end probes by plasmid rescue.

Plasmid rescue of YAC left-end fragments from EG,
ABI and EW YACs was carried out as described by Schmidt
et al. (1992). IPCR was used to generate left and right
end fragments using the protocol and primers described in
Schmidt et al (1992).

Gel blotting and hybridisation conditions.

Gel transfer to Hybond-N, hybridisation and washing
conditions were according to the manufacturer's
instructions, except that DNA was fixed to the filters by
UV Stratalinker treatment and/or baked at 80°C for 2 h.
Radiolabelled DNA was prepared by random hexamer
labelling.

RFLP analysis.

Two to three micrograms of plant genomic DNA was
prepared from the parental plants used in the crosses and
cleaved in a 300 µl volume. The digested DNA was ethanol
precipitated, separated on 0.7% agarose gels and
blotted onto Hybond-N filters. Radiolabelled cosmid,
lambda or YAC end probe DNA was hybridised to the filters
to identify RFLPs.

RNA extractions 

RNA was extracted using a method described by Dean
et al (1985).

PolyA RNA was isolated using the PolyATtract® mRNA
isolation system (Promega).

DNA extractions 

Arabidopsis DNA was extracted by a CTAB extraction
method described by Dean et al (1992).

Isolation of cDNA by RT-PCR 

Total RNA was isolated from whole seedlings at the
2-3 leaf stage growing under long days in the greenhouse.
For first strand cDNA synthesis, 10 µg of RNA in a volume
of 10 µl was heated to 65°C for 3 minutes, and then
quickly cooled on ice. 10 µl of reaction mix was made
containing 1 µl of RNAsin, 1 µl of standard dT17-adapter
primer (1 µg/µl; Frohman et al, 1988), 4 µl of 5x reverse
transcriptase buffer (250 mM TrisHCl pH 8.3, 375 mM KCl,
15 mM MgCl2), 2 µl DTT (100 mM), 1 µl dNTP (20 mM) and 1 µl
reverse transcriptase (200 units, M-MLV Gibco). This
reaction mix was then added to the RNA, creating a final
volume of 20 µl. The mixture was incubated at 42°C for 2
hours and then diluted to 200 µl with water.
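
The final concentrations in the 20 µl first strand reaction follow from the stock concentrations and volumes listed above by simple dilution (C1 x V1 = C2 x V2). The sketch below works this through; it assumes the buffer concentrations quoted in brackets are those of the 5x stock, and the names used are illustrative only.

```python
# Final concentrations in the 20 ul first-strand cDNA reaction.
# Assumption: the bracketed buffer concentrations describe the 5x stock.
FINAL_VOLUME_UL = 20.0

# component: (stock concentration in mM, volume added in ul)
components = {
    "Tris-HCl": (250.0, 4.0),
    "KCl": (375.0, 4.0),
    "MgCl2": (15.0, 4.0),
    "DTT": (100.0, 2.0),
    "dNTP": (20.0, 1.0),
}

def final_concentration(stock_mM, volume_ul, final_volume_ul=FINAL_VOLUME_UL):
    """Dilution rule C1*V1 = C2*V2, solved for C2."""
    return stock_mM * volume_ul / final_volume_ul

final_mM = {name: final_concentration(stock, vol)
            for name, (stock, vol) in components.items()}

for name, value in final_mM.items():
    print(f"{name}: {value:g} mM")

# RNasin, primer, buffer, DTT, dNTP and enzyme volumes should total 10 ul.
mix_volume_ul = 1 + 1 + 4 + 2 + 1 + 1
print(f"reaction mix volume: {mix_volume_ul} ul")
```

The listed component volumes add up to the stated 10 µl of reaction mix, which with the 10 µl of RNA gives the 20 µl final volume.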

10 µl of the diluted first strand synthesis reaction
was added to 90 µl of PCR mix containing 4 µl 2.5 mM dNTP,
10 µl 10x PCR buffer (Boehringer plus Mg), 1 µl of a
100 ng/µl solution of each of the primers, 73.7 µl of water
and 0.3 µl of 5 units/µl Taq polymerase (Boehringer or
Cetus Amplitaq). The reaction was performed at 94°C for 1
minute, 34 cycles of 55°C for 1 minute, 72°C for 2 minutes
and then finally at 72°C for 10 minutes.
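
The PCR mix can be checked in the same way. The sketch below assumes two primers per reaction (the text says only "each of the primers"; two is the number that makes the volumes total 90 µl) and uses illustrative names throughout.

```python
# Volume and dose check for the 100 ul PCR described above.
# Assumption: two primers are used, 1 ul of each.
pcr_mix_ul = {
    "dNTP (2.5 mM)": 4.0,
    "10x PCR buffer": 10.0,
    "primer 1 (100 ng/ul)": 1.0,
    "primer 2 (100 ng/ul)": 1.0,
    "water": 73.7,
    "Taq polymerase (5 U/ul)": 0.3,
}
template_ul = 10.0  # diluted first-strand reaction

mix_total_ul = sum(pcr_mix_ul.values())
reaction_total_ul = mix_total_ul + template_ul

taq_units = pcr_mix_ul["Taq polymerase (5 U/ul)"] * 5.0
dntp_final_mM = 2.5 * pcr_mix_ul["dNTP (2.5 mM)"] / reaction_total_ul
primer_ng_each = 100.0 * pcr_mix_ul["primer 1 (100 ng/ul)"]

print(f"PCR mix volume:   {mix_total_ul:.1f} ul")
print(f"reaction volume:  {reaction_total_ul:.1f} ul")
print(f"Taq polymerase:   {taq_units:.1f} units")
print(f"dNTP final:       {dntp_final_mM:.2f} mM")
print(f"each primer:      {primer_ng_each:.0f} ng")
```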

DNA sequencing 

The Sanger method was used to sequence fragments of
interest inserted in a Bluescript plasmid vector.
Reactions were performed using a Sequenase kit (United
States Biochemical Corporation).

Screening the Landsberg erecta cosmid library and the 
PRL-2 cDNA library. 

26000 clones arrayed in microtitre plates were
screened by gridding offsets from 16 microtitre plates
onto LB-tet (10 µg/ml) plates and then taking colony lifts
onto Hybond N filters. 1 x 10^6 plaques of the CD4-71-PRL2
library (supplied by the Arabidopsis Biological Resource
Center at Ohio State University) were screened by plating
20 plates of 50000 plaques and then taking plaque lifts
onto Hybond N filters.

Transformation of Arabidopsis

The cosmids containing DNA from the vicinity of FCA
were mobilised into Agrobacterium tumefaciens C58C1, and
the T-DNA introduced into Arabidopsis plants as
described by Valvekens et al, 1988. Roots of plants grown
in vitro were isolated and grown on callus-inducing
medium (Valvekens et al, 1988) for 2 days. The roots were
then cut into short segments and co-cultivated with
Agrobacterium tumefaciens carrying the plasmid of
interest. The root explants were dried on blotting paper
and placed onto callus-inducing medium for 2-3 days. The
Agrobacterium were washed off, the roots dried and placed
onto shoot-inducing medium (Valvekens et al, 1988)
containing vancomycin to kill the Agrobacterium and
kanamycin to select for transformed plant cells. After
approximately 6 weeks green calli on the roots start to
produce shoots. These are removed and placed in petri
dishes or magenta pots containing germination medium
(Valvekens et al, 1988). These plants produce seeds in
the magenta pots. These are then sown on germination
medium containing kanamycin to identify transformed
seedlings containing the transgene (Valvekens et al,
1988).



EXAMPLE 3 - PLANTS HOMOZYGOUS FOR THE T-DNA INSERTION
CARRYING FCA FLOWER EARLIER THAN HETEROZYGOTES.

Two transformants of each of the four cosmid clones
that complemented the fca mutant phenotype were selfed
and seed of late and early flowering individuals was
collected and plated on kanamycin-containing medium. All
the late flowering progeny were kanamycin sensitive
whilst progeny from the early flowering individuals were
either homozygous or heterozygous for kanamycin
resistance. This demonstrates that the kanamycin marker
on the T-DNA carrying the region containing the FCA gene
completely co-segregated with the early flowering
phenotype. Thus, complementation to early flowering was
due to sequences within the insert of the cosmid. Leaf
number (LN) was counted for the early flowering individuals
either homozygous or heterozygous for the T-DNA insert.



TABLE 1

cosmid   CL58I16, CL44B23, cAtA1, cAtA2

K/K      10.3 (9), 9.7 (4), 9.5 (2), 12 (2), 14.2 (5),
         9.6 (3), 9.1 (7)

K/-      13 (4), 10.4 (10), 11.8 (6), 11.1 (6), 15 (3),
         10.8 (5), 9.3 (3), 12.5 (3), 14.4 (7)

Analysis of flowering time (as measured by total LN)
in transformants showing complementation of the fca
mutant phenotype. For each cosmid two independent
transformants were analysed. The leaf number was counted
on F2 individuals (the number of which is shown in
brackets) which were then selfed and progeny sown on
kanamycin-containing medium to establish whether the
plant was homozygous (K/K) or heterozygous (K/-) for the
T-DNA insert.
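
The genotype assignment described in this legend rests on the segregation of kanamycin resistance among selfed progeny: a plant homozygous for a single T-DNA insert gives only resistant progeny, whereas a heterozygote is expected to give resistant and sensitive progeny in a 3:1 ratio. The following is a minimal sketch of such a classification using a chi-square test against 3:1. The progeny counts are hypothetical, the function names are illustrative, and the original text does not state how the scoring was evaluated statistically.

```python
# Classify a parent plant from kanamycin scores of its selfed progeny.
# Assumes a single T-DNA locus; all counts used below are hypothetical.

CRITICAL_1DF_5PCT = 3.841  # chi-square critical value, 1 d.f., p = 0.05

def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for a 3 resistant : 1 sensitive expectation."""
    total = resistant + sensitive
    expected_r, expected_s = 0.75 * total, 0.25 * total
    return ((resistant - expected_r) ** 2 / expected_r
            + (sensitive - expected_s) ** 2 / expected_s)

def classify_parent(resistant, sensitive):
    if resistant == 0:
        return "-/-"      # no T-DNA insert
    if sensitive == 0:
        return "K/K"      # all progeny resistant
    if chi_square_3_to_1(resistant, sensitive) < CRITICAL_1DF_5PCT:
        return "K/-"      # consistent with 3:1 segregation
    return "unclear"      # deviates from 3:1 (e.g. more than one locus)

for counts in [(40, 0), (31, 9), (0, 40)]:
    print(counts, classify_parent(*counts))
```

With small progeny samples an all-resistant result does not exclude a heterozygote, so in practice enough seedlings need to be scored for the distinction to be reliable.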

The results, shown in Table 1 above, indicate that
the homozygotes flowered significantly earlier than the
heterozygotes in all 8 transformants analysed. Thus
increasing the FCA gene dosage, and therefore most likely
the amount of gene product, causes earlier flowering.

EXAMPLE 4 - ANTISENSE EXPERIMENTS. 

A 1184 bp BamHI (bp 3547, Fig 3)/HindIII (bp 4731, Fig
3) restriction fragment from the FCA cDNA clone was
subcloned into the BamHI/HindIII restriction sites of
pBluescriptKSII. The insert was released with the enzymes
BamHI and XhoI and subcloned into an Agrobacterium binary
vector pSLJ6562 (J. Jones, Sainsbury Laboratory). The
resulting plasmid contains the CaMV 35S promoter
transcribing the FCA cDNA fragment to produce antisense
RNA, terminated with 3' sequences from the nopaline
synthase gene. This plasmid also carries LB and RB
Agrobacterium sequences for delivery into plant cells
and a nos5'-kan-ocs3' fusion to allow kanamycin selection
for transformants. The construct was introduced into
Arabidopsis thaliana ecotype Landsberg erecta using the
root explant transformation procedure of Valvekens et al
(1988).
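
The size of the antisense fragment follows directly from the two restriction site coordinates quoted above, and the antisense RNA corresponds to the reverse complement of the sense strand. The sketch below illustrates both points; the six-base sequence is a made-up example, not FCA sequence, and the function name is illustrative.

```python
# Fragment length from the restriction-site coordinates given in the text,
# and the sense/antisense relationship shown on a made-up toy sequence.
BAMHI_POS = 3547    # bp, Fig 3 numbering
HINDIII_POS = 4731  # bp, Fig 3 numbering

fragment_length = HINDIII_POS - BAMHI_POS
print(f"BamHI/HindIII fragment: {fragment_length} bp")

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

toy_sense = "ATGGCC"  # hypothetical sequence, for illustration only
toy_antisense = reverse_complement(toy_sense)
print(toy_sense, "->", toy_antisense)
```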

Selfed seed from five transformants was collected and
sown on kanamycin-containing medium, and 10 kanamycin
resistant individuals were transplanted to soil. Three of
the transformants segregated for a single T-DNA insertion;
the others had two or more. Flowering time, assayed as
rosette leaf number, was measured. Progeny from four of
the five transformants were late flowering, producing 12
rosette leaves, compared to 4 for the fifth transformant.
Grown alongside, in these particular conditions,
non-transformed Landsberg erecta and fca-1 plants flowered
with ~4 and 11 rosette leaves respectively. Thus the
antisense construct (as a single locus) effectively
reproduced the late flowering phenotype of the fca-1
mutation.

EXAMPLE 5 - CONSTRUCTION OF PROMOTER FUSIONS TO THE FCA 
OPEN READING FRAME. 

A genomic SalI-XhoI fragment carrying the whole FCA
gene plus 64 bp upstream of the putative start of
translation and 500 bp downstream of the site of
polyadenylation was cloned into the XhoI site of the
Agrobacterium binary vector pSLJ6562 (described above).
This resulted in a vector carrying a nos-kan fusion for
transformant selection and a fusion where the 35S
promoter is driving the FCA genomic region (21 exons, 20
introns). Transformants have been made using this
construct.

This construct, when introduced into fca-4 plants,
corrected the late flowering phenotype, causing the plants
to flower with 6.4 leaves under a long-day photoperiod.
This was similar to wild-type Landsberg erecta, which
flowered with 6.2 leaves when grown alongside.

EXAMPLE 6 - CONSTRUCTION OF AN FCA GENE LACKING INTRONS -
TRANSCRIPTS γA AND γB.

The γA construct was created by cloning together
seven fragments:

i. an EcoRI (a site present next to the insert junction
in the multiple cloning site of the vector) - SalI
fragment from the cosmid CL43B23. This fragment contains
the 5' promoter and untranslated region of FCA and the 5'
region of the ORF.

ii. a 425 bp SalI-HindIII restriction fragment from
cDNA clone 77B.

iii. the region of the spliced transcript covering
the 5' splice site of intron 3 was generated using RT-PCR
with primers cDNAII-BamHI and IanRT1. The product was
reamplified using cDNAII-1 and RevEx4, digested with SalI
and BglII and cloned into pBluescriptKSII digested with
SalI and BamHI. A 270 bp HindIII fragment from this
plasmid was then used in the reconstruction of the
fully-spliced transcript.

iv. a region of the spliced transcript was amplified
using RT-PCR and primers BamX and IanRT1. This was
digested with HindIII and BglII and the 52 bp fragment
used in the reconstruction of the fully spliced
transcript.

v. a region of the spliced transcript was amplified
using RT-PCR and primers BamX and Rev404 (position
indicated on Fig. 3). A 256 bp ClaI-BamHI fragment was
released and gel-purified for use in the reconstruction
of the fully spliced transcript.

vi. a ClaI-SpeI fragment was excised from the FCA
cDNA clone (the 1811 bp clone isolated from the PRL-2
library).

vii. a SpeI-XhoI fragment, carrying the last ~140 bp
of 3' untranslated region plus ~500 bp of 3' genomic
sequence, was isolated from the FCA genomic clone.

The seven fragments used to construct the FCA gene
lacking introns were assembled in two parts, 5' region
and then 3' region, which were then combined.

A. 5' region. Fragment iv was cloned into
pBluescriptKSII as a HindIII/ClaI insert. Fragment ii was
then cloned into this as an EcoRI/HindIII fragment (the
EcoRI site coming from the multi-cloning site in the cDNA
cloning vector). Fragment iii was then cloned into the
HindIII site between fragments ii and iv, the correct
orientation being determined using an asymmetrically
positioned RsaI site. Fragment i was then cloned into the
EcoRI/SalI sites.

B. 3' region. Fragment vii was cloned into the
SpeI/XhoI sites present in fragment vi (the XhoI site
coming from the multiple cloning site in the vector).
Fragment v was then cloned into the BamHI site, the
correct orientation being determined using an
asymmetrically positioned ClaI site.

The 3' region containing fragments v, vi and vii was
then cloned into the plasmid containing the 5' fragments
as a ClaI/XhoI fragment.

The γB construct was generated by replacing the EcoNI
fragment (1503 bp to 2521 bp of spliced transcript) with
an EcoNI fragment from a clone derived from RT-PCR from
Ler RNA that contained the alternatively spliced form
encoding the full length protein.

The resulting constructs were released from the
vector using EcoRI and XhoI and cloned into the
EcoRI/XhoI sites of the Agrobacterium binary vector
pSLJ1714 (Jones et al 1992). Transformants carrying this
construct have been generated.

Construct γA, when introduced into Landsberg erecta,
caused it to flower with 5.6 leaves under a long-day
photoperiod. This was slightly earlier than wild-type
Landsberg erecta, which flowered with 6.2 leaves when
grown alongside. When grown under a short-day photoperiod,
1/4 of the progeny from the transformant flowered early
(with an average of 8.7 leaves). This is significantly
earlier than wild-type Landsberg erecta, which flowers
with 23.5 leaves under these conditions.



EXAMPLE 7 - EXPRESSION IN E.COLI.

The γB construct, described in Example 6, was
digested with SalI and KpnI and cloned into the XhoI-KpnI
sites of the E. coli expression vector pRSETC (Invitrogen
Corp.). The resulting vector has the FCA cDNA cloned in
frame with a polyhistidine metal binding domain, which
enables the recombinant protein to be purified away from
native E. coli proteins using a metal affinity resin
(ProBond™ Ni2+, Invitrogen Corp.). The FCA protein did
not bind well to the affinity columns and so was
separated from the E. coli proteins by excision from an
SDS-polyacrylamide gel. Protein was extracted from the
gel slice and used to inject rabbits. A booster injection
was given and then two bleeds taken. The antibodies produced
detect the FCA protein dot blotted onto nylon membrane at
>1/10,000 dilution.

EXAMPLE 8 - PRIMERS DESIGNED TO AMPLIFY GENES CONTAINING 
RRM DOMAINS WITH HIGH HOMOLOGY TO FCA. 

Based on the homology between etr-1, an EST derived
from a human brain mRNA (dbEST H1995); the Drosophila
Sex-lethal protein; the human nervous system proteins HuD,
HuC, Hel-N1 and Hel-N2; and the Xenopus proteins elrA,
elrB, elrC and elrD, a set of degenerate PCR primers was
designed corresponding to two regions of very high homology.

Amino acid    F     V     G     S     L     N     K
OLIGO 1 5'    TTT   GTG   GGG   AGG   CTG   AAC   AAG
                C     A     A   TCA   T A     T     A
                      T     T     T     T
                      C     C     C     C

Amino acid    R     G     C     F     V     K     Y
OLIGO 1 3'    TCC   GAC   GCC   GAA   GCA   GTT   TAT
alternative bases (column positions not recoverable):
              A  A  A  A  A  C
              T  T  T
              C  C  C
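
As background to the degenerate oligos above, the maximum number of distinct coding sequences for a short peptide can be estimated from the number of synonymous codons per amino acid in the standard genetic code. The sketch below computes this upper bound for the two conserved peptides, FVGSLNK and RGCFVKY. The actual primer pools may be less degenerate if not every synonymous codon was included, and the function name is illustrative.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def peptide_degeneracy(peptide):
    """Upper bound on the number of DNA sequences encoding the peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)

for peptide in ("FVGSLNK", "RGCFVKY"):
    print(peptide, peptide_degeneracy(peptide))
```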







EXAMPLE 9 - CONSTRUCTION OF FCA DERIVATIVES TO GENERATE
DOMINANT NEGATIVE MUTATIONS AND TO ANALYSE THE EXPRESSION
AND SPLICING PATTERN OF THE FCA GENE.

A construct expressing the second open reading frame
of transcript αB under the control of the FCA promoter
was constructed by deleting the first open reading frame
(from 450 bp to 1206 bp). This was done using oligo
mutagenesis to introduce a SphI site at the two
positions, digesting and religating the vector.

To examine FCA expression, FCA promoter-GUS fusion
constructs have been made. FCA promoter + exons 1-4 of
FCA fused to the β-glucuronidase (GUS) gene have been
constructed to monitor the splicing within intron 3. The
entire FCA spliced cDNA (γB) with GUS fused in frame at
the C-terminus has been made to monitor FCA protein
localization within the cell.

EXAMPLE 10 - IDENTIFICATION OF FCA HOMOLOGUES WITHIN THE
ARABIDOPSIS GENOME.

A four genome equivalent Landsberg erecta cosmid
library was screened using low stringency conditions
(40°C overnight, 1% SDS, 5 x SSC, 0.5% milk powder) with
the complete FCA genomic clone. The filters were washed 2
x 20 min at 45°C in 2 x SSC, 0.5% SDS. After exposure they
were then rewashed 2 x 20 min, 50°C in 2 x SSC, 0.5% SDS.
61 cosmid clones were picked, plus two negative control
cosmids. Five of these were additional FCA clones,
leaving 56 putative FCA homologues. Minipreps were
prepared from 10 ml overnight cultures of cosmids, digested
with EcoRI, run on 0.8% gels with positive and negative
controls on each gel and Southern blotted. The blots were
hybridised separately to 77B and FCA cDNA (originally
called 61A) (Fig. 7) using the conditions described above
and then washed at 45°C only.

Of the putative homologues:

(a) - 2 cosmids hybridized only to 77B

(b) - 11 cosmids hybridized only to 61A

(c) - 31 cosmids hybridized to both cDNAs

(d) - 13 cosmids were difficult to score or showed no
detectable hybridization

(a) the 2 cosmids appear not to be related

(b) - 49 C 22 and 67 I 3 share common EcoRI
fragments

- 18 G 16 and 7 L 2

(c) - 39 G 10, 46 H 15, 56 F 2 and 59 A 8 share
common EcoRI fragments,

- 39 G 10 and 56 F 2 share an additional fragment

- 4 H 4 and 45 K 24 share two fragments

- at least nine other pairs of cosmids may have
at least one EcoRI fragment in common.



Table 2

Primers        Sequence                              bp start
                                                     (Figure 3)

cDNAII-BamHI   5' CAGGATCCTTCATCATCTTCGATACTCG 3'    25
cDNAII-1       5' GTCCCTCAGATTCACGCTTC 3'            228
cDNAII-3'a     5' CACTTTTCAAACACATC 3'               1167
cDNAII-3'b     5' GTTCTCTGTACATTAACTC 3'             1213
BamX           5' ATTGAGATTCTTACATACTG 3'            2568
RevEx1A        5' TAAGACATGTCTGACAG 3'               2838
RevEx1B        5' GTGATCTGATTGTGCAG 3'               3030
RevEx4         5' TAGACATCTTCCACATG 3'               3145
IanRT1         5' CAATGGCTGATTGCAACCTCTC             3320
IanRT2         5' TCTTTGGCTCAGCAAACCG 3'             3348
Rev404         5' CAATGTGGCAGAAGATG 3'               3673
fca-3'a        5' AGGCCATTGTTTGGCAGCTC               4941
fca-3'b        5' CCCAGCTAAGTTACTACTAG 3'            5003
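
Basic properties of the primers in Table 2 can be derived from their sequences. The sketch below computes length, GC content and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only an approximation for short oligonucleotides and is not a method stated in the original text. It also checks that the cDNAII-BamHI primer carries a BamHI recognition site (GGATCC). Only three of the primers are included, and the function names are illustrative.

```python
# Simple sequence statistics for a few primers from Table 2.
# The Wallace rule is a rough estimate and is an assumption of this sketch.
primers = {
    "cDNAII-BamHI": "CAGGATCCTTCATCATCTTCGATACTCG",
    "cDNAII-1": "GTCCCTCAGATTCACGCTTC",
    "RevEx4": "TAGACATCTTCCACATG",
}

BAMHI_SITE = "GGATCC"

def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(seq.count(base) for base in "GC")

def wallace_tm(seq):
    """Approximate Tm in degrees C: 2 per A/T plus 4 per G/C."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc

for name, seq in primers.items():
    gc_percent = 100.0 * gc_count(seq) / len(seq)
    site = "yes" if BAMHI_SITE in seq else "no"
    print(f"{name}: {len(seq)} nt, GC {gc_percent:.0f}%, "
          f"Tm ~{wallace_tm(seq)} C, BamHI site: {site}")
```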

CLAIMS:

1. A nucleic acid isolate comprising a nucleotide
sequence coding for a polypeptide comprising the amino
acid sequence shown in Figure 2.

2. Nucleic acid according to claim 1 wherein the coding
sequence comprises a sequence shown as an exon in Figure
1.

3. Nucleic acid according to claim 2 wherein the coding
sequence comprises the sequences shown as exons in
Figure 1.

4. Nucleic acid according to claim 1 wherein the coding
sequence is a mutant, allele or variant of the coding
sequence of Figure 1.

5. Nucleic acid according to any of claims 1 to 3
comprising an intron.

6. Nucleic acid according to claim 5 comprising an
intron as shown in Figure 1.

7. Nucleic acid according to claim 6 wherein said
intron is intron 3 of Figure 1.

8. A nucleic acid isolate comprising a nucleotide
sequence coding for a polypeptide comprising an amino
acid sequence mutant, allele or variant of the FCA amino
acid sequence of the species Arabidopsis thaliana shown
in Figure 2, by way of insertion, deletion, addition or
substitution of one or more amino acids, or a homologue
from another species, wherein expression of said nucleic
acid in a transgenic plant influences a flowering
characteristic of said plant.

9. Nucleic acid according to claim 8 wherein said
flowering characteristic is the timing of flowering.

10. Nucleic acid according to claim 9 wherein said
mutant, allele or variant has the ability to advance
flowering in a plant.

11. Nucleic acid according to claim 9 wherein said
mutant, allele or variant has the ability to delay
flowering in a plant.

12. Nucleic acid according to any of claims 8 to 11
comprising an intron.

13. Nucleic acid according to claim 12 comprising an
intron as shown in Figure 1.

14. Nucleic acid according to claim 13 wherein said
intron is intron 3 of Figure 1.

15. Nucleic acid according to claim 14 comprising the
nucleotide sequence of FCA αB, i.e. that of Figure 3.

16. Nucleic acid according to claim 8 that has the
nucleotide sequence of FCA αA, i.e. intron 3 of Figure 1
and all the exons of Figure 1 except for the exon
nucleotides indicated in Figure 1 to be within the
alternative intron splicing sites around intron 11.

17. Nucleic acid according to claim 8 that has the
nucleotide sequence of FCA γA, i.e. all the exons of
Figure 1 except for the exon nucleotides indicated in
Figure 1 to be within the alternative intron splicing
sites around intron 11.

18. Nucleic acid according to claim 8 wherein said
species other than Arabidopsis thaliana is a Brassica.

19. Nucleic acid according to claim 18 wherein said
homologue comprises the amino acid sequence shown in
Figure 8b.

20. Nucleic acid according to claim 19 comprising the
coding sequence shown in Figure 8a.

21. Nucleic acid according to claim 19 wherein the
coding sequence is a mutant, allele or variant of the
coding sequence of Figure 8a.

22. A nucleic acid isolate comprising a nucleotide
sequence coding for a polypeptide comprising an amino
acid sequence mutant, allele or variant of the amino
acid sequence encoded by the nucleic acid of claim 19,
by way of insertion, deletion, addition or substitution
of one or more amino acids, which mutant, allele or
variant has at least 80% amino acid identity with the
sequence of Figure 8b and ability to influence a
flowering characteristic of a plant.

23. Nucleic acid according to any of claims 1 to 22
further comprising a regulatory sequence for expression
of said polypeptide.

24. Nucleic acid according to claim 23 comprising an
inducible promoter.

25. A nucleic acid isolate comprising a nucleotide
sequence complementary to a coding sequence of any of
claims 1 to 22, or a fragment of a said coding sequence
suitable for use in anti-sense regulation of expression.

26. Nucleic acid according to claim 25 which is DNA and
wherein said nucleotide sequence complementary to a said
coding sequence or a fragment thereof is under control
of a regulatory sequence for anti-sense transcription.



27. Nucleic acid according to claim 26 comprising an
inducible promoter.

28. A nucleic acid vector suitable for transformation of
a plant cell and comprising nucleic acid according to
any preceding claim.

29. A host cell containing heterologous nucleic acid
according to any preceding claim.

30. A host cell according to claim 29 which is
bacterial.

31. A host cell according to claim 29 which is a plant
cell.

32. A plant cell according to claim 31 having said
heterologous nucleic acid within its genome.

33. A plant cell according to claim 32 having more than
one said nucleotide sequence per haploid genome.

34. A plant comprising a plant cell according to any of
claims 31 to 33.

35. Selfed or hybrid progeny or a descendant of a plant
according to claim 34, or any part or propagule of such
a plant, progeny or descendant, such as seed.

36. A method of influencing a flowering characteristic
of a plant, the method comprising causing or allowing
expression of the polypeptide encoded by heterologous
nucleic acid according to any of claims 1 to 24 within
cells of the plant.

37. A method of influencing a flowering characteristic
of a plant, the method comprising causing or allowing
transcription from heterologous nucleic acid according
to any of claims 1 to 24 within cells of the plant.

38. A method of influencing a flowering characteristic
of a plant, the method comprising causing or allowing
anti-sense transcription from nucleic acid according to
any of claims 25 to 27 within cells of the plant.

39. Use of nucleic acid according to any of claims 1 to
24 in the production of a transgenic plant.

40. Use of nucleic acid according to any of claims 25 to
27 in the production of a transgenic plant.

Figure 1

1 gatctaggtg aaattaatct gaagtttaga aatagatttt cttggaactt 
51 cggagaaaat atgcttcact caactttttt ttggtgctat atgaacaaag 
101 ataatggtca tatgaatgta aacgtgtttt gggatgatgt fcatettgttc 
151 catagatgcg gttggaagaa ttgcatttgg actgcaaaaa ctgatggcct
201 ttatctttgg aattcagcga gcggtgaaga tgtattgtca agaaaatggg 
251 aagttggatg gtgaagccat atttttgctt ttgggtaatt ttttagtaca 
301 tgtatcttgt tgtttttggc aaaaaaaaaa ttgaaataat aaaaaacatt 
351 tgttttaact ttctctctta ttttgtgtat ttttcatcaa tgatagattt 
401 tttgttttag-ttctttattt ataggtcatt taattattag- attaatttcc 
451 tgagataata agatcataga ttaaataaca atattgtgtt tgtgatatat 
501 agagattaca ttttacactt atatatagtg gtaagatttc tttttgcttt 
551 caaaccatta aaaacctgtb aaacgattaa cttgactcaa gacaaagcca 
601 ttgattattg actcatgaat ctagtgactc atgatgaagg agacgaacag 
651 taaatattca tttgattatt ttaggtaaaa ggtagttcag- acctagtcat 
701 atatcctcta aattcatata gtgatgcaag tattttgcat tacttagaac 
751 tttatattat tgatcaccca acacatgatt taataaacgc catgaaatgc 
801 atgtactata tcaaaatgtt tctgaagcat atagttgaca tgagaatttt 
851 ggattggact taagaatgtg agagttacct gaaatgtcaa tttttttccc 
901 tttgttaacg aaaactcatt ggaacaattg tatccccctt ttggcagtat 
951 ataaatatat tgatggccca agtagctgta ttttccgtta tcagccaaga 
1001 ctcaataaag tctaccggtc caaatttcaa ctgaatcacc ggtccaacca 
1051 ctattiaccgt aactagaccg ctttttctbt tttacattcg gacaaaaaaa 
1101 tcaaaatutc gagcaactaa attgatctca tcttcaatcA AATTCATCAT 
1151 CTTCGATACT CGTTTCTTCT CTCTTTGGTT TCATACAGAT CCCAAATTTC 
1201 TAGGGCTCCT AGTCCTTTGA TTTCTTCGAC TGGAATCGCA ATTCCCCACT 
1251 ACGTCAAGCT GGACAGACAC CGAAGGGATC GCCATGAGAG TGGCGGCTAC 
1301 GAGGATTCCT ACCATAACCA CGGAGCCCAT CCCAGAGGTC CATCTCGTCC 
1351 CTCAGATTCA CGCTTCGAAG AGGATGATGA TCATTTTCGC CGCCACCGTC 
1401 GTCGTCGTGG AAGCAGCCCT AGC AATTATC GAATTGGAAT TGGGGGCGGA 
Figure 1 Continued 
1451 GGAGGAGGTA ATGGTGGTCG ACGCTGGGAA GATGACACTC CTAAGGATTT 
1501 TGATGGTCCC GGAGATGGAG GTTTCCGGCA GjAT^AATGGT CCCCCAGATA 
1551 GAGTAGATTT TAAGCCT ^TG ( GGTCCTCACC ATGGTGGAAG TTTTCGGCCT 
1601 Ia^^ggtttg CCTACGATGA TGGTTTTCGT CCAATGGGTC CTAACGGTGG 
1651 TGTGGGAGGA GAAGGGACAC GGTCAATTGT TGGAGCTCGG TATAACTATC 
1701 CCGCGAAGTA TCCTCCTTCA GAGAGTCCAG ACAGGAGGAG ATTTATCGGT 
1751 AAAGCAATGG AGTCTGATTA TTCTGTAAGA CCGACTACAC CGCCGGTCCA 
1801 GCAGCCTCTT TCCGGTCAGA AAAGAGGGTA TCCTATCTCA GACCATGGCA 
1851 GCTTTACTGG AACTGgtaag catgagttca c.tcttctttc ttctatgtat 
1901 atttattctt gtagtctgtt aaggttcctg agtgtctctt atttttgtgg 
1951 gaatcaatga ttagagtatt gaaaggtagt atggttgtta tgttactgta 
2001 ttgttgaagg tttttcatgg gatcgactct agaggatcct ttcgattttc 
2051 ccatgtatgt gataatcaaa actatatgee atcttcatgt gtatccttat 
2101 ctggttaatt tgatttgcag ATGTCTCTGA TCGTAGCAGT ACAGTCAAGC 
2151 TTTTTGTTGG ATCTGTACCA AGGACAGCTA CAGAAGAAGA Agtgagttaa 
2201 tcttggaaat cattgttatc tatatactca ttactgagaa ccttttctaa 
2251 attttttctg ttggttttca tattgtagAT CCGTCCCTAT TTCGAACAGC 
2301 ATGGAAATGT TCTGGAGGTT GCTCTGATCA AGGACAAGAG AACTGGACAG 
2351 CAGCAAGgta tgtcaatctc cattttatta ggaaatagtc gtgaattata 
2401 ctttttaaaa tttcaggtct ccctgaaaag gctgatggga agcaacccca 
2451 gtctcatcat tggcctccaa ttgtttgcaa caattttegg gettattget 
2501 tatgcttgcc agcgtcttiat ctgtgttcga ttctgtcaca gaagaaggct 
2551 acctgtigcta agaaagggtti tatgtactta tgttgggcaa atagatttcg 
2601 ctact-ligtgL gtattctaga actutagatg tgtttgaaaa gtgtagaatzt 
2651 LattgagggL gLLHUagagu tggagl:taai: gtacagagaa ctgaatitLtg 
2701 cLgLLqccLL LauagUggga attggLtata agaacaticgc uattutcctc 
2751 LccuatLgaa aitcaLLLLc ULiiactcULc cLcuagaugg auugaagalig 
2801 tug eg La egg HcuLgacagg aLgaalignat tHULtuaagu ^ggtiagutiug 
2851 auaaggacati gaygLicaaa agauggLLiic utigamugcc actccugctg 
2901 gtcaaagatic Lggccgccuu ucUaauucLa ccaugcugga ggucuggcgti 
Figure 1 Continued 

2951 cutcaimuuc uuccaiiatca atmtacgggt gtgccgucua tcggecaaug 
3001 atggcaULcc UtUtaccttt ttggatgagt gatgctggaa tgaatgcgtt 
3051 tctccttttc ttttgttgat ggcctgagga actatgatgg ctatatttct 
3101 ttccactctc ttitgaatggc ctgaaatgtg tgctttctgt atggtcgtcc 
3151 ctctcaatstt cttggatggc ttgtgatgtg atataccatc tctcgtcata 
3201 ggtgaatgaa tgatttgttt agtagttctt atgtatgtat tttgtatgtt 
3251 cccacgtctc tattccttgg atggcttgtg atgtgatata ccatctctcg 
3301 tcatagatga atgaatattt tgttgagtag ctcttatgtc tgtatggtgg 
3351 cccttgcagt gctgatcgat atttatgtgg aagaaatgtt tgatgataga 
3401 tbttttttgt atgctccctt ttcgctaatc aagcctttgt gcttgcaagg 
3451 tgcaactgtt attttatta-t tgaatttcct gttctactac tccatttagt 
3501 tctgtctcta ttttgtcagt gtgaagaaat actagacgat gaatggtgtg 
3551 tttgtacgtg catagttatt tataaattct tgactttcca agaagttatt 
3601 atttctataa ctgctacacc tttgfcggatg gcagaacaaa tgcatctgat 
3651 tgtggtgaca taaacacttt tgatcgcggt tgaatgtact agattccata 
3701 caactctttc ttcagccttg tgaaatatta ttatgttagg tggtgcaaac 
3751 atatggaagg aacctgattg ttttagtttc ttagaatagt ttctgatgtt 
3801 aatacagcat gttgacttca ctctcttgcc cttgatcaat cagcatcagg 
3851 caggggccta attatgtatfc acatgaagca atcgtattct tttctgaatt 
3901 agattttttt ccaatgagtt atcttgccca taactgtagt "tctttatttg 
3951 aagtcttcaa atgcttgatg tatggtgacg aaaatgtgta tatgttttgg 
4001 ttttgattat ccgctactca tcaattattg agattctitac atactgaatc 
4051 cgttactttg gacctatagt tatgttttat gttgctaatt aacttgtaca 
4101 tgtttctaga ttutctttca aatggatcct gclitggacaa atgcagccac 
4151 ccttitgtctg aaaggccctc ttgtagatat gtiiavictgca gatactgact 
4201 gtgttcaatt mttaatat t tgttuttgcc atacccecca tuugaagaca 
4251 ttaatttatu ctctccaaca actttacatc aatauutaag Lggaggctgt 
4301 caqacaliglc ttatgat tut cctactgaac ULaucjlicjcLV: .tgagtiagtiac 
4351 aiictugttac Uagtiacaa c L tgatggtaga aggaaaagr.u gaaccciigaa 
4401 acagauagci: taagtatcag tcuncaatgc agGCiXVrTGT TTTCTAAAAT 
4451 ATGCAACTTC GAAAGATGCG G AT AG AGCC A TCAGAGCACT GCACAATCAG 
4501 ATCACTCTTC CTGGGgtaat taccctgagg ctttctctta tcaagaacag 

4551 gaaactatag gttgtttcac cttttataat tttgttgatt cccagGGAAC 

4601 TGGTCCTGTT CAAGTTCGAT ATGCTGACGG GGAGAGAGAA CGCATAGgta 

4651 atcaacttcc acacagagta tctaatgtgg ctgtcattgt ctagfcgttca 

4701 tagccaagac catacgctgc ataagttcag attacaaaaa ttaagaaaat 

4751 gtgggaaatg atatgaactt tatggatgtt gatcctfcttc tttccctgtt 

4801 ttctttgcct tactatcaag tgatatagtt ctcttcttct gaagGCACCC 

4851 TAGAGTTTAA GCTTTTTGTT GGTTCACTAA ACAAGCAAGC CACTGAAAAA 

4901 GAAGTTGAGG AGgtatgttt cgtatcttac tttttgaagt tgttacttat 

4951 gtcagattaa cggaacaggg aagagttcta aacttggata ttattgtgtc 
5001 ccctgttacc tgagttgata attttaaatg actctttgat aaattttgtt 
5051 agtcttacca aagggtgagfc gtctagaaaa fcctgtgtcaa taatgcaagc 
5101 gcttggacat tctacttact gtgtaatctc ttcttccaat tgatccaact 
5151 gtttgactgt cataatagat aaaattaata aatgtgaacg gctaccttcc 
5201 cagttcaact tatgtgtttc aatttctcat gtaatctttt aacaaactgt 
5251 tttattgtta ttgctttaac agATCTTTTT ACAATTTGGT CATGTGGAAG 
5301 ATGTCTATCT CATGCGGGAT G AAT AT AG AC AGAGTCGTGg tatgttttgt 
5351 aatttgtact agattctata aattatttgt tgtgtgatga tgtJtgagatg 
5401 gtgaaactgt gtttttcact ttgtagGATG TGGGTTTGTT AAATATTCAA 
5451 GCAAAGAGAC GGCAATGGCA GCTATCGATG GTCTCAACGG AACTTATACC 
5501 ATGAGAGtaa gctgbgaatc acataagtat ctcagtttct ■ ctcattatca 
5551 ccctttggac ctgttttgtt tactggcctc tatcctttcc ccagGGTTGC 
5601 AATCAGCCAT TGATTGTTCG GTTTGCTG AG CCAAAGAGGC CTAAACCTGG 
5651 CGAGTCAAGg taatgccttg ggtactatal: tttgattaat cctaatactc 
5701 ttatcaagta aalilgtatat: accttcattc tttglitcegt cLgagttata 
5751 tttgtggaga atcttutgga catggtggag agttgggaac cctgULcctl: 
5801 ctccagttat Lactggaa Ug tgaagcattg ctttctagat atccttaagl: 
5851 agtttcugiiu UCCagGG AAA TGGCACCTCC TGTTGG ACTT GGTTCAGGGC 
5901 CTCGTTTTCA AG C'LTCAGGA CCAAGgtiaac tggtgtgaaa ggagaLcatg 
5951 attatgctca ctaggcaatc atatatgtt-g acttiaccccg gucticcticat: 
Figure 1 Continued 

6001 ctctattcgu eagGCCTACC TCTAACTTTG GTCACTCTAC TCGGGATGTA 
6051 AGCCACACAA ATCCTTGGCG TCCAGCTACT TCACGAAACG TAGGCCCACC 
6101 TAGTAACACT GGGATCCGTG GTGCCGGTAG TGACTTTCCC CCTAAACCAG 
6151 gtcaagcaac attzgccttca aatcaggtga gaacaggttg atgatcatgt 
6201 atatcatctt aaatctgcac attcatataa gtaagcgcat agagtttgca 
6251 tgtattgtgc gagacaaata aaaagaaagt acttcabata ctgcacacat 
6301 gggcttatga caggtgaaaa gaagcatgaa gttctgacct ttcaactttt 
6351 catataatgc aacaaacacg atgtgtgttg ctcaaatgat atggccttaa 
6401 tttgcagttt gtcagttact gaggcaattt tttttttgaa taatttctag 
6451 ccctgatgtg agctttttta aatgtaacat tctatattgt tagGGTGGCC 
6501 CGTTAGGTGG TTATGGTGTT CCTCCCCTTA ACCCTCTCCC AGTCCCTGGA 
6551 GTTTCATCTT CTGCCACATT GCAACAGgta ctttagctat atttttccaa 
6601 ttaagcaaat ctgaaaatgt tgbgatgatt aacttggatt ttcaattgtt 
6651 tctattccat agCAAAATCG GGCAGCTGGC CAGCATATAA CACCATTAAA 
6701 AAAACCTCTT CACAGTCCAC AGGGTCTCCC TCTCCCCCTC CGTCCGCAAA 
6751 CTAATTTCCC TGGGGCCCAG GCACCCTTGC AGAATCCTTA TGCTTATAGC 
6801 AGCCAGTTGC CTACCTCTCA GCTGCCACCA CAGCAAAACA TCAGTCGTGC 
6851 AACTGCTCCT CAAACTCCTT TG AAC ATT AA TCTACGGCCA ACAACTGTGT 
6901 CTTCTGCAAC TGTTCAATTT. . CCCCCTCGTT CCCAGCAGCA ACCGCT&CAA 
6951 AAGATGCAAC ATCCTCCTTC TGAGCTAGCT CAGCTCTTGT CGCAGCAAAC 
7001 TCAGAGTCTA CAAGCAACAT TCC AATCGTC TCAGCAAGCA ATTTCTCAGC 
7051 TGCAcjcAGCA GGTGCAGTCT ATGCAGCAAC CAAACCAAAA TTTACCACTC 
7101 TCACAGAATG GCCGAGCTGG TAAACAACAG gtatgaataL aglictctcag 
7151 ttigcatcUgc ccagacgggt tcLLcagctg ctattgtgtt gbtttaacbt 
7201 aaaacbacuu cctgatagac atcccgbtti: ttatccttca ugtgttttag 
7251 tautctcccc u u mcliaatig LLcctcUcgg clzgcctctu; atcagTGGGC 
7301 TGGATGTGCA ATTCCAAG AG TGGCTAGCAC CACTCGTTCC ACACCAGTGA 

7351 gctatctcca aacagctgca cctx: ( CAG[rAA gtcagagcgt AccrrcTGTC 

7401 AAATCTACCT GGACCGAGCA TACCTCCCCT CATCGATTTA AATATPATTA 

7451 CAATGGTCTA ACGGGTGAAA GCAAGcjtgag eaacgtggcu ccLcuttiaat: 
7501 aUaCLiiccLU gcgagcuuca ggaguauncc ucci:ggtu»:e tUgtigctiact: 
Figure 1 Continued 

7551 gataaiicctt: acacatigtiat aLizt tiatiaLt tgaagtcctl: cagtiacgtgc 
7601 catattatgt atataattca cttttgcagT GGGAAAAACC TGAGGAAATG 
7651 ATAGTGTTCG AACGAGAGCA ACAGAAACAG CAACAACATC AAGAGAAGCC 
7701 AACTATACAG CAGTCCCAGA CCCAATTACA GCCGTTGCAG CAACAACCAC 
7751 AACAAGTTCA GCAGCAATAT CAGGGCCAGC AATTACAGCA GCCGTTTTAT 
7801 TCTTCACTGg ttggttztcgt tttcatgctg gbtacattca aatatttttg 
7851 tcacafcggtt tctaatttgc atatttactc ttgttcattt ggagttgcag 
7901 tatccaactc caggggccag ccataatact. caggtgtata tctgtttaat ' 
7951 ctgtttactt atttttcatt tcaagatttg attcttgata tgctaatctt 
8001 gtggtagaag gagattgacc accttaaagt aaaattcagt agccatggtt 
8051 ttgccagcat tttgaaatac agataacaaa tctctaacgt gaatgcctat 
8101 tttcctttct aaaatgcagT ATCCATCATT GCCAGTAGGT CAAAATAGCC 
8151 AGgtacatat ctgaatctgt ggacttattt ttcattgaac fcgattgattc 

8201 tcagttacaa cattgacttc ctctgatgcg tagtttttgt aacatatcag 
8251 aafcaacaaaa acttcatctg attcgtatat tctctggttg aaaatctttt 
8301 tttcttttct ggaaaatgca gTTTCCTATG TCAGGAATTG GTCAGAATGC 
8351 TCAGgtatat atctcatttt gtattaacaa tttcccatac cttctgtacc 
8401 tttgaattta atcacagaac ataatgagtt cttggattta atgtcatttt 
8451 aaaaagaaac atcagtgata tgacttcctt ccttggttaa aaatggttta '* 
8501 ggcagagctt attttctatt ctgtttggat tgtctagGAT TATGCTCGGA 
8551 C AC AT AT ACC CGTGGGAGCT GCTTCAATGA ATGATATATC AAG AACTCC A 
8601 CAGgtagtta tggtttttat cagtgattca gaacttctct ctgttcataa 
8651 ttcgtccttt ggtattcaga tgttcttttt cgttgaaacc gttuttttcc 
8701 ttaattctct LtacaaccaL atctcttttt cccagAGCCG TCAATCTCCC 
8751 CAAGAACTCA TGTGGAAGAA TAAAGCTTGA Ggttcatatc taccctttct 
8801 cliccLctctc ttgi:ai:tiu.uc uccatiaccga aacacalitcc aaUgLatgtg 
8851 gLUUcttLag LLgaagLiac cLcUgLgtlug aticgatiactzc tacttcagGT 
8901 ACATGAGACG AGGAGCTAAA CTATCTCAGT AGCTAGATAG AAATTTCTGG 
8951 AACTAATTAG TCAAGGAGAG GAAAAGCAGC AATGGTAGTG TCCTTAGTCT 
9001 CTGATTTTTT TAGTTAACCC CTTCAGTTAT AATAGATAGG CGATCGTAGA 
Figure 2 
Figure 3 

1 AAATTGATCT CATCTTCAAT CAAATTCATC ATCTTCGATA CTCGTTTCTT 
51 CTCTCTTTGG T T T CAT AC AG ATCCCAAATT TCTAGGGCTC CTAGTCCTTT 
101 GATTTCTTCG ACT GG AATCG CAATTCCCCA CTACGTCAAG CTGGACAGAC 
151 ACCGAAGGGA TCGCCATGAG AGTGGCGGCT ACGAGGATTC CTACCATAAC 
201 CACCGAGCCC ATCCCAGAGG TCCATCTCGT CCCTCAGATT CACGCTTCGA 
251 AGAGGATGAT GATGATTTTC GCCGCCACCG TCGTCGTCGT GGAAGCAGCC 
301 CTAGCAATTA TCGAATTGGA ATTGGGGGCG GAGGAGGAGG TAATGGTGGT 
351 CGACGCTGGG AAGATGACAC TCCTAAGGAT TTTGATGGTC CCGGAGATGG 
401 AGGTTTCCGG CAGATGAATG GTCCCCCAGA TAGAGTAGAT TTTAAGCCTA 
451 TGGGTCCTCA CC AT GGTGG A AGTTTTCGGC CTATGGGGTT TGCCTACGAT 
501 GATGGTTTTC GTCCAATGGG TCCTAACGGT GGTGTGGGAG GAGAAGGGAC 
551 ACGGTCAATT GTTGGAGCTC GGTATAACTA TCCCGCGAAG TATCCTCCTT 
601 CAGAGAGTCC AGACAGGAGG AG AT T TAT CG GTAAAGCAAT GGAGTCTGAT 
651 TATTCTGTAA G AC C G AC T AC ACCGCCGGTC CAGCAGCCTC TTTCCGGTCA 
701 GAAAAGAGGG TATCCTATCT CAGACCATGG CAGCTTTACT GGAACTGATG 
751 TCTCTGATCG TAGCAGTACA GTCAAGCTTT TTGTTGGATC TGTACCAAGG 
801 ACAGCTACAG AAGAAGAAAT CCGTCCCTAT TTCGAACAGC ATGGAAATGT 
851 TCTGGAGGTT GCTCTGATCA AGGACAAGAG AACTGGACAG CAGCAAGGTA 
901 TGTCAATCTC CATTTTATTA GGAAATAGTC GTGAATTATA CTTTTTAAAA 
951 TTTCAGGTCT CCCTGAAAAG GCTGATGGGA AGCAACCCCA GTCTCATCAT 
1001 TGGCCTCCAA TTGTTTGCAA CAATTTTCGG GCTTATTGCT TATGCTTGCC 
1051 AGCGTCTTAT CTGTGTTCGA TTCTGTCACA GAAGAAGGCT ACCTGTGCTA 
1101 AGAAAGGGTT TATGTACTTA TGTTGGGCAA ATAGATTTCG CTACTTGTGT 
1151 GTATTCTAGA ACT TTAG ATG TGTTTGAAAA GTGT AGAATT TATTGAGGGT 
1201 GTTTTAGAGT TGGAGTTAAT GTACAGAGAA CTGAATTTTG CTGTTGCCTT 
1251 TATAGTGGGA ATTGGTTATA AGAACATCGC TATTTTCCTC TCCCTATTGA 
1301 AATTCATTTT CTTTACTCTT CCTCTAGATG GATTGAAGAT GTTGTGTATG 
Figure 3 Continued 

1351 GTCTTGACAG GATGAATGTA TTTTTTTAAG TTGGTAGTTT GATAAGGACA 
1401 TGAGGTTCAA AAGATGGTTT CTTGATTTGC CACTCCTGCT GGTCAAAGAT 
1451 TTGGCCGTCT TTCTAATTTT ATCATGTTGG AGGTTTGGCG TCTTCATTTT 
1501 CTTTCATATC AATTTATGGG TGTGCTGTCT ATTGGTTAAT GATGGCATTC 
1551 CTTTTACCTT TTTGGATGAG TGATGCTGGA ATGAATGCGT TTCTCCTTTT 
1601 CTTTTGTTGA TGGCCTGAGG AACTATGATG GCTATATTTC TTTGCACTCT 
1651 CTTTGAATGG CCTGAAATGT GTGCTTTCTG TATGGTCGTC CCTCTCAATT- 
1701 TCTTGGATGG CTTGTGATGT GATATACCAT CTCTCGTCAT AGGTGAATGA 
1751 ATGATTTGTT TAGTAGTTCT TATGTATGTA TTTTGTATGT TCCCACGTCT 
1801 CTATTCCTTG GATGGCTTGT GATGTGATAT ACCATCTCTC GTCATAGATG 
1851 AATGAATATT TTGTTGAGTA GCTCTTATGT CTGTATGGTG . GCCCTTGCAG 
1901 TGCTGATCGA TATTTATGTG GAAGAAATGT TTGATGATAG ATTTTTTTTG 
1951 TATGCTCCCT TTTCGCTAAT CAAGCCTTTG TGCTTGCAAG GTGCAACTGT 
2001 TATTTTATTA TTGAATTTCC TGTTCTACTA CTCCATTTAG TTCTGTCTCT 
2051 ATTTTGTCAG TGTGAAGAAA TACTAGACGA TGAATGGTGT GTTTGTACGT 
2101 GCATAGTTAT TTATAAATTC TTGACTTTCC AAGAAGTTAT TATTTCTATA 
2151 ACTGCTACAC CTTTGTGGAT GGCAGAACAA ATGCATCTGA TTGTGGTGAC 
2201 ATAAACACTT TTGATCGCGG TTGAATGTAC TAGATTCCAT ACAACTCTTT 
2251 CTTCAGCCTT GTGAAATATT AT T AT GT TAG GTGGTGCAAA CATATGGAAG 
2301 GAACCTGATT GTTTTAGTTT CTTAGAATAG TTTCTGATGT TAATACAGCA 
2351 TGTTGACTTC ACTCTCTTGC CCTTGATCAA TCAGCATCAG GCAGGGGCCT 
2401 AATTATGTAT TACATGAAGC AATCGTATTC TTTTCTGAAT TAGATTTTTT 
2451 TCCAATGAGT TATCTTGCCC ATAACTGTAG TTCTTTATTT GAAGTCTTCA 
2501 AATGCTTGAT GTATGGTGAC GAAAATGTGT ATATGTTTTG GTTTTGATTA 
2551 TCCGCTACTC ATCAATTATT GAGATTCTTA CATACTGAAT CCGTTACTTT 
2601 GGACCTATAG TTATGTTTTA TGTTGCTAAT TAACTTGTAC ATGTTTCTAG 
2651 ATTTTCTTTC aaatggatcc tgcttggaca aatgcagcca CCCTTTGTCT 
2701 gaaaggccct cttgtagata tgttatctgc agatactgac tgtgttcaat 
2751 TTTTTAATAT TTGTTTTTGC CATATTCTCC ATTTGAAGAC ATTAATTTAT 
Figure 3 Continued 

2801 TCTCTCCAAC AACTTTACAT CAATATTTAA GTGGAGGCTG TCAGACATGT 
2851 CTTATGATTT TCCTACTGAA CTTATGTGCT TTGAGTAGTA -CATCTTGTTA 
2901 CTAGTACAAT TTGATGGTAG AAGGAAAAGT TGAACCCTGA AACAGATAGC 
2951 TTAAGTATCA GTCTTTAATG CAGGCTGTTG TTTTGTAAAA TATGCAACTT 
3001 CGAAAGATGC GGATAGAGCC ATCAGAGCAC TGCACAATCA GATCACTCTT 
3051 CCTGGGGGAA CTGGTCCTGT TCAAGTTCGA TATGCTGACG GGGAGAGAGA 
3101 ACGCATAGGC ACCCTAGAGT TTAAGCTTTT TGTTGGTTCA CTAAACAAGC 
3151 AAGCCACTGA AAAAGAAGTT GAGGAGATCT TTTTACAATT TGGTCATGTG 
3201 GAAGATGTCT ATCTCATGCG GGATGAATAT AGACAGAGTC GTGGATGTGG 
3251 GTTTGTTAAA TATTCAAGCA AAGAGACGGC AATGGCAGCT ATCGATGGTC 
3301 TCAACGGAAC TTATACCATG AGAGGTTGCA ATCAGCCATT GATTGTTCGG 
3351 TTTGCTGAGC CAAAGAGGCC TAAACCTGGC GAGTCAAGGG ACATGGCACC 
3401 TCCTGTTGGA CTTGGTTCAG GGCCTCGTTT TCAAGCTTCA GGACCAAGGC 
3451 CTACCTCTAA CTTTGGTGAC TCTAGTGGGG ATGTAAGCCA CACAAATCCT 
3501 TGGCGTCCAG CTACTTCACG AAACGTAGGC CCACCTAGTA ACACTGGGAT 
3551 CCGTGGTGCC GGTAGTGACT TTTCCCCTAA ACCAGGTCAA GCAACATTGC 
3601 CTTCAAATCA GGGTGGCCCG TTAGGTGGTT ATGGTGTTCC TCGCCTTAAC 
3651 CCTCTCCCAG TCCCTGGAGT TTCATCTTCT GCCACATTGC AACAGGAAAA 
3701 TCGGGCAGCT GGCCAGCATA TAACACCATT AAAAAAACCT CTTCACAGTC 
3751 CACAGGGTCT CCCTCTCCCC CTCCGTCCGG AAACTAATTT CCCTGGGGGC 
3801 CAGGCACCCT TGCAGAATCC TTATGCTTAT AGCAGCCAGT TGCCTACCTC 
3851 TCAGCTGCCA CCACAGCAAA ACATCAGTCG TGCAACTGCT CCTCAAACTC 
3901 ctttgaacat TAATCTACGG CCAACAACTG TGTCTTCTGC AACTGTTCAA 
3951 TTTCCCCCTC GTTCCCAGCA GCAACCGCTA CAAAAGATGC AACATCCTCC 
4001 TTCTGAGCTA GCTCAGCTCT TGTCGCAGCA AACTCAGACT CTACAAGCAA 
4051 CATTCCAATC GTCTCAGCAA GCAATTTCTC AGCTGCAGCA CCAGGTGCAG 
4101 TCTATGCAGC AACCAAACCA AAATTTACCA CTCTCACAGA ATGGCCCAGC 
4151 TGGTAAACAA CAGTGGGCTG GATCTGCAAT TCCAAGAGTG GCTAGCACCA 
Figure 8a 

1 GTTAACGGAT CACACAGTAT AATATAAAAC TAGGTGTTTT GCCCGCACAT 
51 GCGAGCATAA TTTTCTCATC AATACTTATT AGTTTATATC T-TATTAATCT 
101 AAAACCAGCA TGATAAGTTA TTATTTATGT TTTCAGATAG TTAAATCAAA 
151 CATCAAAGTA TTTATATATG TCAAATATTT TATCAAAAAT ATATACTTAT 
201 TATTGTGTTA AATTTTTTAA AAC AC TC ATA TCTTAGAAAT AGTTTAGAAA 
251 ATATCTTTAT ATAAATGTTT TTTAACTTTT ATAATAAAAA TATTGTTTTC 
301 AGATAGCAAC AAAATATATA TAGAATTAAC TTATTTTTAA ATTTTTTGAT 
351 AATTTTATTA TATTATTTAA GAATCAATTA TTTATATTAA TATAACATAT 
401 AATTTTCACT GATTAAATAA AATTC GTTTT TAATTATATA AATTCATTAA 
451 GAGTATTGTT TTAATAACAC ATTAGCGAAC ATCAGCTAGA AATTAATAAT 
501 AAATCAATAA CCTAGCTAAA AGTCTAAAAC CTAATAAAAT ATGACAAATA 
551 AGAAAAATTA ACTAAATTTT AATATAAAAT ATAAATTTAA T ATT ACT AAA 
601 ATAAAATTCA TTTTTAATAT ATATAAGATT CTTAAGGGTA TTTTTTAAAT 
651 TAATAAATTA GTGACTTAGC TAAAAATAAA TAATAAATCA ATGATTTAGC 
701 TAAAACCTAA TAAAAACATG AC AAAT AAGC AAATTTACTA AAATTATTGA 
751 TAATATAAAA TATG ATTG AT TCTTTAATAC AAAATTAAAA TAAGAGTTTT 
801 TTTAAATCAA ACATAAGTCT GCCGTATCGG TGTTAAAAAA AAAATCATTA 
851 ATAGTGTCGT AGGAATTATG TATTTCCATT AGCGAATAAA ATTG AAGC AG 
901 AGTGTTGGAG GATAGCTCAA CGTATAGGCG AGATTATGGA GATTGATATG 
951 GGAGTTGCCG TAACGGACAC AACTGTTCCT CTGCAAAGAA ACGCTCCTAC 
1001 TAGATGGACA TGTCAAGTTG ATGCATCCTG GATAAATGAA AGAAACATAT 
1051 CTGGACTTGG CTTTGTGTTA ATGGATGGTG ACTTCCCAAT ACTGTTTGTA 
1101 TCAACGGCCG ATATACACGC ACCAAATCAC CACTGCAAGC GGAACGTGAA 
1151 GGTTTGCTAT GGGCAATGCA AGAGATACTG AAGTTTGGAC GCAGAGTGAT 
1201 GGTCTTTCAA TCGACTATGA ACAACTGGTT ATACTCATTC AAAAGGAGGA 
1251 AGATGGCCTG CTTGGACTCG GAGCTCGACG AAATACAAGT TGTATCAAAG 
1301 AATTTTCTG A AATTTCTATT GCTTATATTC CTAGATCTTT AAAATTCCGT 
1351 ACGAATAGCC TAGCAAAAGG TGTCGATCAC CCGCATCACG ATCAGCTTTT 
Figure 8a Continued 

1401 GTAACCCTTT GCACCAGTGG CTAGCCCACA GCTAGCATGA GGGTGCAAAA 
1451 GAGAATAAGT CGAAACAAGC TAGCATGAGG GTGCCAAAAA AG AG AATAAA 
1501 GTCGAAAGTA AAACTGAATA TCCAATGAAC AAAATTATCA GAAATCCATA 
1551 TTTATGTGGA TGTCTATATG GGACAAACAA TTTTTTTAGA TCAATCCTAA 
1601 AATATATAAT TTAAAAAAAC CATTTAAACA AACCATCAAA ATTTTGAATA 
1651 TTACACCAAA AAAAAATATA AAGACCAACT ATATTATATT CATGTATAAT 
1701 GTGTAGTGGT AAGATTCAAA AAAATTAACT TACTTTACAG TAAGGGAAAA 
1751 TTAGATTTTT TATTCCATAT TTACAGTAAA AACATAACAT TTTATAAAAC 
1801 TAAACAATTG ACATAATAGT ACAAAATATG AAAAAAAAAT CAAAATACTA 
1851 AGAACCTACT ATT AG TT AAA TTaAGTACAG TCAAGTCAAC TAGTATGTGA 
1901 ATGAGATTTA ACTTACAAAT TCATTACGAG ACAATAGCAC ATTTAGAAGA 
1951 ATAACATGTA GATTGATGTG CACACAAAAA AAAACCAACG GGTACAAATG 
2001 TTAACCGCTC CACCGGTCGA ACCATAATCC AGACCGGTTT TGCTATTTAA 
2051 ACCGCTCAAA TCGCAAAGTA CGTTTCGCTT ACTTCCAGCA AACCACCATT 
2101 GATCTCTCCT CCAATTCACA AATCCAATTT CTCTAGGGTT TGATTTCTTC 
2151 GACTTGAATT GCATTTCCAT CCGAATTTCC CCAAATTCGT CAAGCTGGAT 
2201 AGGCACCGAG GGGATCGCCA CGAGAGTGCC TTACG ACG AT TCCTACCGTA 
2251 ACCACCGAGC TCACCTAGAG GTCCCCCTCT ACTCTCAGAT TCACCCTCCA 
2301 TGTCACGTTT CGGCGAGGAT GACGAAGGTT TCAGCCGCCG TCGTCGCCGT 
2351 GGAAACAGCC TAGCAATTAT CAGTTGGATG GGACAGAGGA GGTGGCGATC 
2401 G ACGCTGGG A AGATGACGGC CACGATCGTA TTTC AC AG AG AGGCGTGGGA 
2451 GAGTAGAATT TCAGCCTATG GGTTATGGCT TCGACGGAGG TTTTCCGCCG 
2501 ATGAGTCGCG ACGGAGGATT TTGGCCTAAC GTGCCAGTGA ATTTTCCGCC 
2551 ATCGGAAAGT CCAGA'TGC AG GGGG ATATTC CGGCGGCAGG GGATTTCAAT 
2601 CAACGGGGCC TC-CTTACTCT GTGAGATTGA CTTCACCGCC GATCGAGCAG 
2651 CCTC'rriKri'G G'PCAGAAAAG AGGTCGTCCT CTCTCGG AGC AGAGTAGCTT 
2701 TACTGGAACT GG'TAAGCTTG GGC'l\:ACTCT ACTGTAATCG AGTTGTTTAG 
2751 AGTl'AACAGT GGT'rcATTTT ATACTTGTAT GTGATAATCA GGCTATTTCC 
2801 AAACTAAA'l^ ACCTTTACTG GATCATTCGT TTTGC AG ATT TACTGATAGT 
Figure 8a Continued 

2851 AGCAGTATGG TGAAGCTTTT TGTTGGCTCT GTACCAAGGA CAGCTACAGA 

2901 AGAAGAAGTG AGTTCATCTT TTTCTTATTT TCCTAATTTC TTCTCAATAT 

2951 ATATGCACTT TCTTGAGGCA ATCTAAACCA CGAAGCTCGT AGACTCTGTT 

3001 CATAAGCCGT TCTTGTTTAT CATTTTGGTT TTCATAGGTC CGTCCCTTTT 

3051 CGAACAACAC GGTAAATGTT CTTGAGGTTG CTTTTATCAA GGACAAGAGA 

3101 ACAGGACAGC AGCAAGGTAT GTTTATCTCC ATTTTACTAG GAACAGTCGT 

3151 GATTTATGCT TCTAAATTTT TCAGGTCTCC TGAAAAGGCT GATGGGAACG 

3201 AACCCCAGTC TCATCATTGG CCTCCATTAG TTTTCAACAA TTTTCGGGCT 

3251 TTTGCTTATG CTAGCGAGCG TCTTATCTGT GTTGCTTTGG CACAGAAGAA 

3301 GGCTGCCTGT TTAGTTTACT AAGAAAGGGT TTTTGTATTG ACCTTGGTAA 

3351 AATAGTTTTT GCGACTTGTG TCCATCCTAG AACCTTAGTT GTGTTTGAAC 

3401 AGTGTAGCAG ACTTTATCAT GTTTTAGAGT TGGAGTTAAT GTACATAAAA 

3451 TTGAACAGAT GTTTTACTGT TGCCTTTTAG TTGGCACTGG TTTAAAGAAC 

3501 GTTGTTTTCT CCTTTCCTAT TGAATTCAGT ATCTCTTTAC TCTTCCTTTC 

3551 GATGAATGAA AATGGTGTAT ATGGTCTTGA CTGGATGAAT GTATTTTTAC 

3601 TTGGTAGTCT TACAACGTTC ATAAAATGGT TTGATTGATA AACCACCCCT 

3651 GCTAGTCAAT ATTTGGCAGT TTCTTAGTGA TTATATCATG TTGGATGTTT 

3701 TGTTTCTTTA GTTTCTTTAA TATCAACTTT GGATGTACCG TCTCTATTGG 

3751 TTGATGATGA AATTATTTTT TACCATTTTG GATGCTTGAT GCCTTAATGA 

3801 ATGGATCTTT CCTTTTTTTC TTATTGTGGA TGGCCGAGGA ACTATAATGA 

3851 ATGTTCTCTT CGCTTTTTTT GAATGGCCTG GGATGTGGAC TTCTTGTATG 

3901 TTCTCACTTT CATTGATGAA TGACTGTTTC GTTCAGTAGC TCTATTGTTC 

3951 TGTATGGTAA CGCTAACACT GCTGATCTAC ATTATGTGGA AGAGATCATA 

4001 TCTCTAATGA TATT1TTTTT CTATGTACCT TTCACCAACC AAGCTCAAAA 

4051 GCI1X3GTTTC AGTTTTTAGT GGTCTTATTC TATATAGAGC TTGGTTTCAG 

4101 rm'TAGTGG TCTTATTCTA TATATTGAGA 'ITGCTCTTCA AAAATTCCAT 

4151 CAAAGTTCTC TCTTTATGAT GCAAGTGTGA AGAATTACTA GATGATGAGT 

4201 GATGTATTAT TTAATAATTC GGGACTTTCC AAGAAGTTAT TGTACGGTGA 
Figure 8a Continued 

4251 CATAAAAGCT TTTACTCATC CCG TTATC AC GGTTTGACTG TAGTAGATTT 

4301 GACACATTCC TTGGTTTGAA ATGTTACATG GTGCTAAGAT ATGGAAGGCA 

4351 ACGATTATTA TAATTTCTTA GAAATACGTC TTAGCTTTCA CTCGCTCTCA 

4401 TTGCTTCGAT CAGCATCAGG CATGAGCCGC CTTAGTATGT ATTTAATGAA 

4451 GCAAGTGTCA TTCTTCTCTA TATGCAACTA TTACCAATGA ATTGACGTTG 

4501 GGTTGTGGTT ATGTCTCTCA GAACTGTAAT TCTTTTTGTG AATGTCGTCA 

4551 AATGTGTGGT GTATGTTGTA TGGTGTATGG TGACGAAAAT GTGATGTATG 

4601 GCTCTAGTTT TAATTATATC ATTTGTTACT TAGCAGTGAT TGAGAACTCT 

4651 TAACTTGTAA TTTTATCTAA TTTTTTTTTG CAGTGATTGG ATTCTTTTTG 

4701 CGTAATATAT ATTTTATTTG CAAATACCGA CTGTGTTCTT TTTAAATAGT 

4751 TTAAAGGCAT ATGCTTTATT TGAAGCACAT TAG TTTATTA TTCTCTCCAT 

4801 CAAATCTACT ACAGTAATGT AAGTCGAGGC TGTCAGGACA TGTCTTATGA 

4851 TTTTCGTACT GAAACTTATG TGCTTTCAAT GTGGTCGTGG CTTGTACATT 

4901 TGTAAAGAAA CTATTTACTA GTATCTCTTG ATGTTTGATG GAGGGACAAG 

4951 TGGAACCTTG AACAGAAGCT TATGTAGCAG TCTTTAATGC AGGCTGTTGT 

5001 TTTGTAAAAT ATGCAACTTC TGAAGATGCG GATAGGGCCA TTAGAGCATT 

5051 GCACAATCAG ATCACTCTTC CTGGGGTAAC TACCATTGAT GCCTTCTCTT 

5101 ATCAAGGACA GGAAAATACA GGTTAACTCT ATCTTTACAA TTTGCTGATT 

5151 CCCAGGGAAC TGGCCTTGTT CAAGTTCGAT ATGCTGATGG GGAGAGAGAA 

5201 CGCATAGGTA ATCAACTTTC GCGCCATATT ATCTGAATCT GGCCTTCATT 

5251 GTCTGGTATA CATAGGGTGA CCATACGCTG T AC AAATTC A AATT ACG AG A 

5301 ATTGAGATAA TGTGGG AAAC TATATGAATC TTAAGGAAGT GGATCCTTTT 

5351 TTCTGTGGTC CTTGCCTCAC TCTCAAGTAT TAACTGATTG AATTTACTTC 

5401 TTCTGAAGGT GCGGTAGAGT TTAAGCTTTT TGTTGGTTCC TTAAACAAGC 
5451 AAGCCACTGA AAACGAGGTT GAGGAGGTAT GTCTC AT ATC CTACTTTTTG 
5501 ATGGAAAGTA ATTACTTATG TCTGATTTAC A A AG AGGG A A GCGTTCTAAA 
5551 TTT AG AT ATT ACAGTATCCC CTGTCGCCTT AGCTGGTAAT TTTAGTGATT 
5601 AT ATG AC A AT TTAGTAGTCC TCTTGGAAGG GTCAGCGGCT TGAAATTTTG 
5651 TGTCAACTAT TCGAGCGCTT ACACATTTTA CTAACTGAGT GATCTCTTCT 
Figure 8a Continued 

5701 TTCAAATGGA CTGACTGAGT GATCTCTTCT TCCAAATGGA TGTAACTTTT 
5751 TGGCTGTCAG CTTTCTTTTC TCAGTAAATA TGATGAAGAT GTGAACGGCT 
5801 ACTTTGTCCT GTTGTTGCTT TAACAGCTCT TTTTGCAATT TGGTCGCGTG 
5851 GAGGATGTCT ATCTCATGCG TGATGAATAT AGACAGAGTC GTGGTATGTC 
5901 TGGTAACTGC CACTAGACTC TATAACTCGT TTGATGGTGT TGATATGGTC 
5951 AAACTGTTTT TGACACTCAT TTAGGATGCG GGTTTGTTAA ATATTCAAGC 
6001 AAAGAGACGG CCATGGCAGC TATCGATGGT CTCAATGGAA CTTATACCAT 
6051 GAGAGTAAGC TGTGAAATCA CATGAGTATC TCACTTTCTC TCATTATCCC 
6101 CTCTAGACCT GTTTTGTTTA CTGGCCTCTT TCCCTTCTCC AGGGTTGCAA 
6151 TCAGCCATTG ATTGTTCGGT TTGCTGATCC AAAGAGGCCT AAACCGGGCG 
6201 AGTCAAGGTA TTGCCTTGGA GACTATATTT TGAATTCATT ATAATGCTAA 
6251 TATCAAAAAA ATTGTGTCTA CTGTCATTGT TTGTTCTATT GAGTTACATT 
6301 TATGAGAATC TTTTGGGGCA TGGGTGGAGG AGAGCTGCGA ACCTTATTCC 
6351 TTCTCCAGTT ATTACTTGAA TGCGATGAAT TTCTTTCTAT ATATCCTTAA 
6401 CTAGTTTCTG TTTCCAGGGA AGTGGCACAT CCTGTTGGAC TTTGTTCAGG 
6451 GCCTCGTTTT CAAGCTTCAG GACCAAGGTG ACTGGGGTGA AAGGAGATCG 
6501 TTGTTTTTGT CATCAATTAA TTATATATTT TGACTAAACG TGGTCTCCTT 
6551 ATCTTCATTT GTTAGGCCTA CATCTAACCT TGGTGACCTT AGTGTGGATG 
6601 TGAGCCACAC AAATCCTTGG CGTCCTATGA ATTCACCAAA CATGGGGCCA 
6651 CCTGGTAACA CTGGGATCCG TGGTACCGGA AGTGACTTGG CTCCTAGGCC 
6701 AGGTCAAGCC AC ATT ACCTT CAAATCAGGT AAGAACAGCT TGATGATCAT 
6751 GTATATTATC TTATATGTAC ACACCCAATC ACACATAAAG TAATCGGGCA 
6801 TAAGGTTTTA CATGTATTGT GTGAGTAGGA CG AAC AT A AT TTATATGCTG 
6851 C AC AT AT A* 1X3 AGCGTATGGA CTCTTGAAAA GAAGCATGAA GTTCCGACCT 
6901 TCCAGCTTTT CATATGATGC AGCAAACTTG ATGTGTTTTG CATTGAAATG 
6951 ATATGGCTTT GATTTGCATT TTGTCAGTTT CTAAGGAGTT TTTTTCTTCA 
7001 ATAATTTCTA CTTCTGATGT TAGCTTTATT TGTGGCATTC TATAATGTTA 
7051 GGGAGGTCCA TTGGGTGGTT ATGTTGTTCC TGCCATTAAC CCTCTACCAG 
7101 TCTCA1X:C*LX: TGCCACATCG CAACAGGTAC TTCAGCTGAA TTTTTCCAAT 
Figure 8a Continued 

7151 AAAGAAAATC TGAAAATGTT GTGTTGATCA GTTAATTTCA ACTGTTTCTA 
7201 TTCCATAGCA AAACCGGGGA GCTGGCCAGC ATATGTCACC ATTACAAAAA 
7251 CCTCTTCACA GTCCACAGGA TGTGCCCCTT CG ACC AC AAA CTAATTTCCC 
7301 TGGGGCCCAA GCATCCTTGC AGAATCCTTA TGGTTATAGC AGCCAGTTGC 
7351 CTACTTCTCA GCTGCGGCCA CAACAAAACG TCACTCCTGC AACAGCTCCT 
7401 CAAGCTCCTT TGAACATCAA CCTACGGCCA ACACCTGTAT CTTCTGCAAC 
7451 TGATCAATTG CGCCCTCGTG CTCAGCAGCC ACCGCCACAA AAGATGCAAC 
7501 ATCCTCCTTC TGAGCTAGTT CAGCTCTTGT CACAACAGAC TCAGACTCTA 
7551 CAAGCAACCT TCCAATCATC TCAGCAAGCA TTTTCTCAAC TGCAGGAGCA 
7601 GGTGCAGTCC ATGCAGCAAC CAAACCAAAA ATTACCAGGC TCACAGACTG 
7651 GCCATGGTAA ACAGCAGGTA CAAACATAGT TCCCTGTTGC ATCTGTCCAG 
7701 TCCAGTTCCT CAGCTGTTTT TGTTGTTTTA ACTTACAATT ATTTCC TG AT 
7751 GTCTAAGTAT TCAATCCTTC ATATATTTTA GTAGTCCCTC TTTTTTATTA 
7801 TGTTTTTCTC GGTTGCTTCT CTATCAGTGG GCTGGATCTG CAATTCCGAC 
7851 AGTTGTTAGC ACCACTGCTT C T AC ACC AG T TAGCTATATG CAAACAGCTG 
7901 CACCTGCAGC AACTCAGAGT GTTGTTTCTC GCAAATGTAA CTGGACCGAG 
7951 CATACCTCGC CTGATGGATT TAAGTATTAT TACAACGGTC AAACCGGTGA 
8001 AAGCAAGGTG AGAAACGTGG TTCCTCTTTA GTTATGTTCT CTTGTGAGTT 
8051 TCAGGAGGAT TCCTTGTATT TGCTGTGCTA TTTATTATCC TTGAACATGT 
8101 ATATGTATAG ATTTC AT ATT TGAAGTTCAT CAATACGTGT CGTAATATAA 
8151 TTG ACTTTTG CAGTGGGAAA AACCTGAGGA AATGGTATTG TTCGAACGTC 
8201 AGCAACAGCA GCCAACTATA AATCAGCCCC AG ACCC AATC ACAGCAGGCT 
8251 CTTTATTCCC AGCCGATGCA GCAAC AACCA CAACAGGTTC ACCAGCAATA 
8301 TCAGGGCCAA TATGTACAGC AGCCTATTTA TTCTTCAGTG GTTGGTTCTG 
8351 TTTTCTTGCT GCTTACATCC ATATAGTTTT CTCACATGGT CTCTAACTTG 
8401 AATATGTATT CTTTVCCATT TGGAGTTGCA GTATCCAACT CCAGGGGTCA 
8451 GCCAGAATGC TC AG G TG T A ' V ATTPACrTAA ATTATTIXSCT TATCTTTCAT 
8501 TTCAGAATTT GATCATTGAG TTACCAATCT AGTGGGTATA AGGAGACGGG 
8551 CCACTTTATG CAATAAACCA TGGTTTTACA AGCGTTTTGA ATATACAGAT 
Figure 8a Continued 
Figure 8b 
Figure 9 